APPENDIX A




The following sections of code accomplish two tasks:

I) Calculation of the topomeric conformation for a particular 
molecule, assuming that the molecule is referenced by a particular 
row of a Tripos Molecular Spreadsheet (MSS). With minor 
adaptations this code could be used in other molecular modeling 
environments, such as Cerius 2, Quanta, or Insight. 

II) Calculation of the line slope, assuming that the biological
data and one or more columns of property data are stored in a Tripos
Molecular Spreadsheet (MSS). Almost any other software for
manipulating data in a spreadsheet or other tabular representation
could be adapted to perform similar calculations, assuming a
Tanimoto function for expressing "distances" between bitsets of equal
cardinality.
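As an illustration of the Tanimoto function mentioned in task II, the following C sketch computes a Tanimoto "distance" between two bitsets of the same length. It is not part of the appended code: the names popcount64 and tanimoto_distance, the 64-bit word layout, and the convention that two empty bitsets have distance 0 are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in one 64-bit word (portable, no compiler builtins). */
static int popcount64(uint64_t w)
{
    int n = 0;
    while (w) {
        w &= w - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Tanimoto distance = 1 - |A and B| / |A or B| for two bitsets, each
   stored as nwords 64-bit words (hypothetical layout).  Two empty
   bitsets are treated as identical, an assumed convention. */
double tanimoto_distance(const uint64_t *a, const uint64_t *b, size_t nwords)
{
    int both = 0, either = 0;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < nwords; i++) {
        both   += popcount64(a[i] & b[i]);
        either += popcount64(a[i] | b[i]);
    }
    if (either == 0)
        return 0.0;
    return 1.0 - (double)both / (double)either;
}
```

Identical bitsets give a distance of 0, and bitsets with no bits in common give 1.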

Both sections of code include procedures written in two languages.
The first is C, familiar to all programmers; the C listings include all
specialized structure declarations and brief explanations of all
functions used. The second is SPL, an interpreted language
available within the SYBYL molecular modeling program, whose
syntax is similar to that of a Unix shell script. The SPL language is
described fully in the volume entitled SPL Manual, found within the
documentation set for SYBYL 6.2, release date July 1995. This volume
includes descriptions of all "expression generators" (functions
returning a value) and "macro commands" not specifically explained
below.

I. Topomeric Field Code:

A. SPL macro CHOM!BUILD_3D. To build topomerically aligned
3D models, the third argument must have the value ALIGN, and the
global associative array element CHOM!Align[ALICYC] must have
the value All_trans. Code to allow user adjustment of these and
other 3D model-building parameters, which appear in this code as other
elements of CHOM!Align[], is not shown.

B. Under these circumstances the following SPL macro
CHOM!AllTrans sets all torsions provided to their topomeric values.

C. To determine the atoms defining each torsion to be adjusted,
CHOM!AllTrans invokes the expression generator %trans_path(),
which executes the following C subroutine SYB_MGEN_CONN_BEST,
with its associated subroutines syb_mgen_conn_att_atoms,
get_path_mw, get_path_xyz, and (if debugging) ashow. No
user-adjustable values are used by this code. All non-obvious include files
and a brief functional description of subroutines external to this code
are provided in section III below.

D. The computation of rotatable-bond-attenuated steric (and/or 
electrostatic, hydrogen bonding) fields for the topomerically aligned 
conformation is carried out by the C subroutine 
QSAR_FIELD_EVAL_RB_ATTEN, which uses the accompanying 
subroutine QSAR_FIELD_RB_WTS to generate an attenuated weight 
for each atom's contribution to the field(s). (Pseudo code for the 
latter subroutine appears in its header comment.) The attenuation 
factor (recommended value of 0.85) is a user-adjustable or 
"tailorable" value, here shown as COMFA!AGGREG_SCALING. The 
user-adjustable HBOND_RAD_SCALING parameter affects the steric 
"radius" of a hydrogen-bonding hydrogen. 

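A minimal sketch of how a rotatable-bond attenuation weight could be generated is given here, assuming that the weight of an atom is the attenuation factor raised to the number of rotatable bonds separating that atom from the root atom. The function names and the way the bond counts are supplied are hypothetical; the actual weights are produced by QSAR_FIELD_RB_WTS, which this sketch does not reproduce.

```c
#include <assert.h>

/* Weight for one atom: scaling raised to the number of rotatable bonds
   (n_rot) between the atom and the root atom.  Assumed form only. */
double rb_atten_weight(int n_rot, double scaling)
{
    double w = 1.0;
    int i;

    assert(n_rot >= 0);
    for (i = 0; i < n_rot; i++)
        w *= scaling;
    return w;
}

/* Fill wts[i] for every atom from its rotatable-bond count n_rot[i]. */
void rb_atten_weights(const int *n_rot, int natoms, double scaling, double *wts)
{
    int i;

    for (i = 0; i < natoms; i++)
        wts[i] = rb_atten_weight(n_rot[i], scaling);
}
```

With the recommended factor of 0.85, an atom two rotatable bonds from the root would contribute with a weight of about 0.72 under this assumption.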
II. Patterson-Distribution Validation Code

A. The SPL expression generator lrt_fast returns the slope of
the "best" line along with the count of data points and the
fractional area, within a "virtual" or conceptual graph of absolute
differences in biological activities vs. absolute differences in the
diversity measurement to be validated. The format of its output
appears in the header comment.

B. The short SPL expression generator dochi shows the
computation of the chi-squared statistic resulting from the output of
the lrt_fast expression generator.

C. The C code functions QSHELL_HIER_LRT,
QSHELL_HIER_DO_LRT, and mdfpt_heapsort generate the results
produced by lrt_fast. These routines generate the biological
differences themselves but rely on some external procedure, not 
shown, to generate the distances between the diversity 
measurements. (The reason is that the method of calculating 
differences depends on the diversity parameter(s). Typically a 
Euclidean distance is calculated for scalar properties, or a Tanimoto 
difference is calculated for bitsets, and if multiple parameters are 
combined to form the diversity measurement to be validated then 
the relative weighting must also be specified by the user.) 
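The pairwise differences that form the "virtual" graph can be sketched as follows for the simplest case of a single scalar property, where the Euclidean distance reduces to an absolute difference. This is an illustration under that assumption, not the appended code; the function names and array layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute difference of two scalars. */
static double abs_diff(double x, double y)
{
    return (x > y) ? (x - y) : (y - x);
}

/* For every pair i < j of the n molecules, store the absolute difference
   in biological activity in d_act[] and the absolute difference in the
   scalar property in d_prop[].  Both output arrays must hold
   n*(n-1)/2 values.  Returns the number of pairs written. */
size_t pairwise_differences(const double *activity, const double *property,
                            size_t n, double *d_act, double *d_prop)
{
    size_t i, j, k = 0;

    assert(activity != NULL && property != NULL);
    assert(d_act != NULL && d_prop != NULL);
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            d_act[k]  = abs_diff(activity[i], activity[j]);
            d_prop[k] = abs_diff(property[i], property[j]);
            k++;
        }
    }
    return k;
}
```

Each (d_prop[k], d_act[k]) pair is one point of the conceptual graph; for bitset descriptors a Tanimoto difference would take the place of abs_diff on the property side.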

Section III. Supporting information for interpretation of the C code in
Sections I and II.

A. Declarations of complex and non-standard data structures
referenced by the declarations within these C procedures, specifically 
for molecules, atoms, and the regions, fields, and other user input 
information that are part of a CoMFA field description. 

B. Functional descriptions of all external subroutines called by 
these C procedures, ordered alphabetically. 



# SECTION I-A. Macro BUILD_3D for generating and storing topomeric alignments
#

@macro BUILD_3D CHOM
# builds 3D models,
# storage in a database or in a conformer column
#
# either not-aligned (just uses Concord or as-is if from Unity,
# or minimizes input structure)
# or aligned for CoMFA (requires core structure as alignment template)
# with optional fixup of side chains, charge calculation
# $1 is row ids in current MSS
# $2 is storage code (will retrieve structure from same place or somewhere)
# $3 is align (U or A)
# $4 is basic building technique
# other arguments, used only if ALIGN is true, are elements
# of the global associative array CHOM!Align

# set up mol retrieval from MSS to be fast and clean
localvar AFFECT_SUBSET_save
localvar EXAMINE_TAILOR_MODE_save
localvar HIGHLIGHT_MSS_save
localvar INFORM_save
localvar INPUT_MODE_save
localvar RELATE_save
localvar SHOW_MOLECULE_save
localvar USER_FUNCTION_save natmcore heavy ys
localvar align ma rid cgq_save tailor_bumps_save newc \
         m a b max_save usehs rat yrat nrat noth

setvar AFFECT_SUBSET_save $TAILOR!EXAMINE!AFFECT_SUBSET
setvar EXAMINE_TAILOR_MODE_save $TAILOR!EXAMINE!EXAMINE_TAILOR_MODE
setvar HIGHLIGHT_MSS_save $TAILOR!EXAMINE!HIGHLIGHT_MSS
setvar INFORM_save $TAILOR!EXAMINE!INFORM
setvar INPUT_MODE_save $TAILOR!EXAMINE!INPUT_MODE
setvar RELATE_save $TAILOR!EXAMINE!RELATE
setvar SHOW_MOLECULE_save $TAILOR!EXAMINE!SHOW_MOLECULE
setvar USER_FUNCTION_save $TAILOR!EXAMINE!USER_FUNCTION
setvar cgq_save $CGQ_TIMEOUT
set CGQ_timeout 0

setvar TAILOR!EXAMINE!AFFECT_SUBSET NONE
setvar TAILOR!EXAMINE!EXAMINE_TAILOR_MODE SILENT
setvar TAILOR!EXAMINE!HIGHLIGHT_MSS NO
setvar TAILOR!EXAMINE!INFORM NO
setvar TAILOR!EXAMINE!INPUT_MODE ROW COLUMN EXPR
setvar TAILOR!EXAMINE!RELATE NO
setvar TAILOR!EXAMINE!SHOW_MOLECULE YES
setvar TAILOR!EXAMINE!USER_FUNCTION NONE

setvar max_save $TAILOR!MAXIMIN2!LS_STEP_SIZE $TAILOR!MAXIMIN2!MAXIMUM_ITERATIONS
setvar ma %table_attribute( MOL_AREA )

# if needed make new place to put output
setvar newc
switch %substr( $2 1 3 )
case NEW)
   setvar newc %math( %table( * COL COUNT ) + 1 )
' table column sin %c^V CONF $newc )
case SYB)
   database open %qspr_table_db( %table_default() ) update
   table ATTRIBUTE SET CONFORMER 0
case )
   setvar newc %substr( $2 1 %math( %pos( _ $2 ) - 1 ) )
   TABLE CONFORMER $newc
endswitch



if %streql( %substr( $3 1 1 ) "A" )
# are we bump checking ?
   if $CHOM!Align[BUMPS]
      setvar tailor_bumps_save $TAILOR!GENERAL!bumps_contact_distance
      tailor set general bumps_contact_distance %math( $CHOM!Align[BUMPS] - 1.0 )
   endif
##
# STEP 1: prepare template fragment
##
   setvar mcore $CHOM!Align[ MCORE ]
# save original template
   setvar mcsav %molempty()
   copy $mcore $mcsav
   default $mcore >$nulldev
   if $CHOM!Align[DEBUG]
      label id *
   endif
   setvar capsln %cat( %sln( $mcore ) )
   setvar natcore %mol_info( $mcore NATOMS )
# if the alignment template has just one free valence,
# make geometrically acceptable template by adding heavy atoms, minimizing
# else use as is
   setvar heavy TRUE
   fillvalence *-H* Hal >$nulldev
   if %gt( %math( %mol_info( $mcore NATOMS ) - $natcore ) 1 )
      copy $mcsav $mcore
      setvar heavy
   endif
   if $heavy
      for a in %atoms(<H*>-<H>)
         modify atom type $a C.3 >$nulldev
         modify atom name $a X1 >$nulldev
      endfor
   endif

   TAILOR SET MAXIMIN2 LS_STEP_SIZE 0.0001 MAXIMUM_ITERATIONS 1000 | |
   MAXIMIN $mcore DONE INTERACTIVE >$nulldev

   if $heavy
      for a in %atoms(X1)
         modify atom type $a HEV >$nulldev
# must rename it !!
         modify atom name $a X1 >$nulldev
      endfor
      setvar ys %set_create( %atoms(X1) )
# orient template so that an R points in the positive X direction
      setvar rat %arg( 1 %set_unpack( $ys ) )
      setvar nrat %arg( 1 %atom_info( $rat NEIGHBORS ) )
      setvar yrat %arg( 1 %set_unpack( %set_diff( \
            %set_create( %atom_info( $nrat NEIGHBORS ) ) $rat ) ) )
      ORIENT USER $nrat $rat $yrat >$nulldev
   endif

# identify all the non-primary atoms for FIT, in/out of the search pattern
# and all the basic torsions (bonds to Ys) that potentially need setting
   setvar tpat %arg( 1 %search2d( %cat( %sln( $mcore ) ) $capsln NoDup 0 y ) )
   setvar hvinpat
   setvar patats
   setvar tors
   setvar usehs
   setvar sybhvats %set_create( %atoms(*-<H>) )
   if %lt( %set_size( $sybhvats ) 3 )
      setvar usehs TRUE
      setvar sybhvats %set_create( %atoms(*) )
   endif
   for a in %range( 1 %sln_atom_count( $capsln ) )
      if %or( "$usehs" "%not( %set_and( %sln_atom_symbol( $capsln $a ) \
            H,F,Cl,Br,I ) )" )
# for FIT, need to know the SYBYL IDs of the heavy atoms
         setvar hvinpat $hvinpat $a
         setvar patats[ $a ] %sln_rgroup_sybid( $mcore $tpat $a )
         setvar patats[ $a ][ YS ] %set_and( "$ys" "%set_create( \
               %atom_info( $patats[ $a ] NEIGHBORS ) )" )
# for each torsion root, need to save the SLN ID of an arbitrary
#   heavy atom torsional definer
         if $patats[ $a ][ YS ]
            setvar tors[ $a ] %set_and( %set_diff( "%set_create( \
               %atom_info( $patats[ $a ] NEIGHBORS ) )" $patats[ $a ][ YS ] ) $sybhvats )
# if there are several possibilities, prefer the lowest #'d carbon
#   to define trans-ness
            if %gt( %set_size( $tors[ $a ] ) 1 )
               if %set_and( $tors[ $a ] %set_create( %atoms(<C*>) ) )
                  setvar tors[ $a ] %set_and( $tors[ $a ] \
                        %set_create( %atoms(<C*>) ) )
               endif
               setvar tors[ $a ] %arg( 1 %set_unpack( $tors[ $a ] ) )
            endif
            for a1 in %range( 1 %sln_atom_count( $capsln ) )
               if %eq( $tors[ $a ] %sln_rgroup_sybid( $mcore $tpat $a1 ) )
                  setvar tors[ $a ] $a1
                  break
               endif
            endfor
         endif
      endif
   endfor
   if $CHOM!Align[DEBUG]
      echo %prompt( INT 1 " " " " )
   endif
endif

default $ma >$nulldev
setvar CHOM!BadRows
##
# build 3D models
##

# off we go ! ! Get MSS row IDS to build models for 
if %streql ( $1 * ) 

setvar rids %table( * ROW NUM ) 
else 

setvar rids %set_unpack( $1 ) 
endif 

for rid in $rids 

# get the next MSS entry to be modelled 
table examine $rid | | >$nulldev 

# fix NO2's (egad what a pain) because Concord & SYBYL are inconsistent
   setvar pat %search2d( %sln( $ma ) N(=O)O ALL 0 y )
   while $pat
      setvar pat %sln_rgroup_sybid( $ma %arg( 1 $pat ) 1 3 )
      modify bond type %bonds( %cat( %arg( 1 $pat ) \
            %arg( 2 $pat ) ) ) 2 >$nulldev
      modify atom type %arg( 2 $pat ) o.2
      setvar pat %search2d( %sln( $ma ) N(=O)O ALL 0 y )
   endwhile

   if $CHOM!Align[DEBUG]
      label id *
   endif

# basic optimization
   switch $4
   case CONCORD)
      CONCORD MOL $ma >$nulldev
# if Concord failed, we may still be awfully flat
# minimize if there are heavy atoms not part of a single aromatic system
      setvar noth %atoms( *-<H> )
      setvar a1 %arg( 1 $noth )
      if %set_diff( "%set_create( $noth )" \
            "%set_create( %atoms( %cat( "{aromatic(" "$a1" ")}" ) ) )" )
         setvar zs %extent_3d( %cat( $ma "(*)" ) )
         setvar zs %math( %arg( 5 $zs ) - %arg( 6 $zs ) )
         if %eq( $zs 0.0 )
            %unflatten( %cat( $ma "(*)" ) )
            MAXIMIN $ma DONE INTERACTIVE
         endif
      endif
   case MINIMIZE)
      MAXIMIN $ma DONE INTERACTIVE >$nulldev
   ;;
   endswitch

# done, if only 3d coord, but for topomeric CoMFA . .
   if %streql( %substr( $3 1 1 ) "A" )
# find any arbitrary 2D hit
      setvar pat %search2d( %cat( %sln( $ma ) ) $capsln NoDup 0 y )
      if %not( $pat )
         setvar CHOM!BadRows %set_or( "$CHOM!BadRows" $rid )
         echo $capsln not found in molecule for Row $rid skipping
         goto nextl
      endif

      setvar pat %arg( 1 $pat )
      setvar allpatats %set_create( %sln_rgroup_sybid( $ma $pat \
            %range( 1 %sln_atom_count( $capsln ) ) ) )

# collect all appropriate heavy atoms for FIT and torsions
      setvar mat1
      setvar mat2
      setvar schns
      for a in $hvinpat
         setvar mat1 $mat1 $patats[ $a ]
         setvar sybat %sln_rgroup_sybid( $ma $pat $a )
         setvar mat2 $mat2 $sybat
# are there heavy atom neighbors to FIT also (and generate torsion lists) ?
         if $patats[ $a ][ YS ]
            setvar ans %set_diff( %set_create( \
                  %atom_info( $sybat NEIGHBORS ) ) $allpatats )
            setvar ans %atoms( $ans-<H> )
            setvar i 1
            for p in %set_unpack( $patats[ $a ][ YS ] )
# add heavy atom neighbors to FIT list
               if %arg( $i $ans )
                  setvar mat1 $mat1 $p
                  setvar mat2 $mat2 %arg( $i $ans )
# generate another torsion for CHOM!AllTrans
                  setvar schns $schns %cat( $sybat "," \
                        %sln_rgroup_sybid( $ma $pat $tors[ $a ] ) "," %arg( $i $ans ) )
               endif
               setvar i %math( $i + 1 )
            endfor
         endif
      endfor

      setvar dofit MATCH %cat( $mcore "(" %set_create( $mat1 ) ")" ) \
            %cat( $ma "(" %set_create( $mat2 ) ")" )
      $dofit >$nulldev
      if $CHOM!Align[DEBUG]
         echo %prompt( INT 1 " " " " )
      endif

# do FIT 

      if %gt( $MATCH_RMS $CHOM!Align[ FITRMS ] )
         setvar CHOM!BadRows %set_or( "$CHOM!BadRows" $rid )
         echo Bad geometric alignment (MATCH_RMS = $MATCH_RMS ) for Row $rid sk
         goto nextl
      endif

# side chain alignments . .
      switch $CHOM!Align[ ALICYC ]
      case User_Macro)
         $CHOM!Align[ ALIDATA ] $ma $CHOM!Align[ MCORE ]
      case All_trans)
      case With_Templates)
         setvar no_rings TRUE
         setvar rbds %set_create( %bonds( {rings()} ) )
         for i in $schns
            setvar jbds %set_unpack( $i )
# can set "side chain" bonds only if connecting bond is not cyclic
            if %set_and( "$rbds" "%bonds( %cat( %arg( 3 $jbds ) = \
                  %arg( 1 $jbds ) ) )" )
               setvar no_rings
            else
               CHOM!AllTrans $jbds
            endif
         endfor
         if $CHOM!Align[DEBUG]
            echo %prompt( INT 1 " " " " )
         endif

         if %streql( $CHOM!Align[ ALICYC ] With_Templates )
            setvar f %open( $CHOM!Align[ ALIDATA ] "r" )
            setvar buff %read( $f )
            setvar slnma %cat( %sln( $ma ) )
            while $buff
# each line of text should have pattern, SLN IDs for the 4 torsion atoms,
# and a torsion value to set
               if %eq( %count( $buff ) 5 )
                  setvar torpat %search2d( $slnma %arg( 1 $buff ) NoDup 0 y )
                  for t in $torpat
                     MODIFY TORSION %sln_rgroup_sybid( $ma $t %arg( 2 $buff ) \
                           %arg( 3 $buff ) %arg( 4 $buff ) ) %arg( 5 $buff ) >$nulldev
                  endfor
               endif
            endwhile
            %close( $f )
         endif
      ;;
      endswitch
   endif

# do a bump check?
   if $CHOM!Align[ BUMPS ]
      if %atoms( {bumps(*,*)} )
         setvar CHOM!BadRows %set_or( "$CHOM!BadRows" $rid )
         echo Bad steric contacts in aligned conformer for Row $rid skipping
         goto nextl
      endif
   endif
# partial charges . .
   switch $CHOM!Align[ CHARGE ]
   case None)
   case User_Macro)
      exec $CHOM!Align[ CHARGEDATA ] $ma
   ;;
   case )
      CHARGE $ma COMPUTE $CHOM!Align[ CHARGE ] | >$nulldev
   endswitch

# put conformer away
   switch %substr( $2 1 3 )
   case SYB)
      database add $ma r >$nulldev
   ;;
   case )
      %wcell( $rid $newc %cat( %cat( %sln( $ma FULL CHARGE ) ) ) ) >$nulldev
   ;;
   endswitch



echo Built row $rid 
nextl: 
endfor

if %streql( %substr( $3 1 1 ) "A" )
   copy $mcsav $mcore
   zap $mcsav
endif

if $CHOM!Align[BUMPS]
   TAILOR SET GENERAL bumps_contact_distance $tailor_bumps_save | |
endif

# done, restore initial EXAMINE settings 
set CGQ_TIMEOUT $cgq_save 

setvar TAILOR!EXAMINE!AFFECT_SUBSET $AFFECT_SUBSET_save
setvar TAILOR!EXAMINE!EXAMINE_TAILOR_MODE $EXAMINE_TAILOR_MODE_save
setvar TAILOR!EXAMINE!HIGHLIGHT_MSS $HIGHLIGHT_MSS_save
setvar TAILOR!EXAMINE!INFORM $INFORM_save
setvar TAILOR!EXAMINE!INPUT_MODE $INPUT_MODE_save
setvar TAILOR!EXAMINE!RELATE $RELATE_save
setvar TAILOR!EXAMINE!SHOW_MOLECULE $SHOW_MOLECULE_save
setvar TAILOR!EXAMINE!USER_FUNCTION $USER_FUNCTION_save

TAILOR SET MAXIMIN2 LS_STEP_SIZE %arg( 1 $max_save ) \
      MAXIMUM_ITERATIONS %arg( 2 $max_save ) | |

# update row and column information
if %streql( %substr( $2 1 3 ) NEW )
# make any new conformer column become the source of molecules
   TABLE CONF %table( * COL COUNT )
   CHOM!UPDATE_ROW_SEL $CHOM!CID_Last
   setvar CHOM!CID_Last %math( $CHOM!CID_Last + 1 )
else
   CHOM!UPDATE_ROW_SEL
endif



# 

# Section I-B. Generates the topomeric conformation of the 3D model 
# 

@macro ALLTRANS chom
# assumes default molecule, takes argument atoms $1 and $2
# where $1 is the JOINed atom of the core, $2 is the atom that
# the rest of the substituent is to be trans to,
# and $3 is the JOINed atom of the substituent
# starts from that atom and sets all side chains
# to a topomeric conformation

localvar bds b bdset a1 a2 tmp sbonds sats rbond pbds torsion ringbonds doit

# check input for legality
setvar tmp %set_create( %atom_info( $1 NEIGHBORS ) )
if %not( %eq( 2 %count( %set_unpack( %set_and( \
      "$tmp" %cat( $2 $3 ) ) ) ) ) )
   echo Bad input to ALLTRANS (atoms $2 $3 not bonded to $1)
   return
endif

# save key bonds
setvar rbond %bonds( %cat( $3 "=" $1 ) )
setvar sats %conn_atoms( $3 $1 )
if %not( $sats )
# echo NO substituent atoms found in ALLTRANS
   return
endif

setvar sats $3 $sats
setvar sbonds %set_create( %bonds( \
      %cat( "{TO_ATOMS(" %set_create( $sats ) ")}" ) ) )

# define the other bonds that might need adjusting
setvar bds %set_create( %bonds( (*-{RINGS()})&<1> ) )
setvar bds %set_and( "$sbonds" "$bds" )
if %not( $bds )
   return
endif

# discard bonds to primary atoms
setvar mval %set_create( %atoms( \
      <H>+<o.2>+<F>+<I>+<Cl>+<Br>+<n.1>+<LP>+<Du> ) )
setvar pds %set_create( %bonds( %cat( "{TO_ATOMS(" $mval ")}" ) ) )
setvar bds %set_diff( $bds $pds )
setvar ringbonds %set_create( %bonds( {RINGS()} ) )

# walk all the important bonds
for b in %set_unpack( $bds )
   setvar doit TRUE
# if this is the JOIN bond, already have some info
   if %eq( $b $rbond )
      setvar a0 $2
      setvar a1 $1
      setvar a2 $3
# still need to be SURE we're not monovalent
      if %or( "%eq( 1 %count( %atom_info( $a1 NEIGHBORS ) ) )" \
            "%eq( 1 %count( %atom_info( $a2 NEIGHBORS ) ) )" )
         setvar doit
      endif
   else
      setvar bdat %bond_info( $b ORIGIN TARGET )
      setvar a1 %arg( 1 $bdat )
      setvar a2 %arg( 2 $bdat )
      if %or( "%eq( 1 %count( %atom_info( $a1 NEIGHBORS ) ) )" \
            "%eq( 1 %count( %atom_info( $a2 NEIGHBORS ) ) )" )
         setvar doit
      endif
      if $doit
# which end leads to root atom? if necessary flip a1,a2 to make that one be a1
         if %set_and( "%set_create( %conn_atoms( $a2 $a1 ) )" $1 )
            setvar tmp $a1
            setvar a1 $a2
            setvar a2 $tmp
         endif
         setvar a0 %trans_path( $a1 $a2 $1 )
      endif
   endif
   if $doit
      setvar a3 %trans_path( $a2 $a1 )
      switch %count( %set_unpack( "%set_and( "$ringbonds" \
            %set_create( %bonds( %cat( $a0 "=" $a1 $a2 "=" $a3 ) ) ) )" ) )
      case 0)
         setvar torsion 180
      case 1)
         setvar torsion 90
      case 2)
         setvar torsion 60
      ;;
      endswitch
      modify torsion $a0 $a1 $a2 $a3 $torsion >$nulldev
   endif
endfor
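The torsion rule applied at the end of the ALLTRANS macro can be restated compactly in C. The sketch below only mirrors the switch in the macro above (180, 90 or 60 degrees according to how many of the two flanking bonds a0-a1 and a2-a3 lie in rings); the function name and its integer argument are hypothetical and do not appear in the original code.

```c
#include <assert.h>

/* Torsion value, in degrees, assigned to a rotatable bond a1-a2 given the
   number of its two flanking bonds (a0-a1 and a2-a3) that are ring bonds. */
int topomeric_torsion(int n_ring_flanking_bonds)
{
    assert(n_ring_flanking_bonds >= 0 && n_ring_flanking_bonds <= 2);
    switch (n_ring_flanking_bonds) {
    case 0:
        return 180;   /* acyclic on both sides: fully trans */
    case 1:
        return 90;
    default:
        return 60;    /* both flanking bonds are in rings */
    }
}
```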



/* Beginning of section I-C, C code implementing the trans_path expression generator */
/*+E : SYB_MGEN_CONN_BEST */
/*
 * int SYB_MGEN_CONN_BEST( identifier, nargs, args, writer )
 * Dick Cramer, Apr. 9, 1995 (written for SELECTOR use)
 *
 * Expression generator that returns the atoms attached to a given
 * atom, excepting the second, in a prioritized order.
 * If there are two arguments, the ordering is by decreasing branch
 * "size", where "size" is first any path with rings encountered, then
 * number of attached atoms, then MW (paths in cycles end when an atom
 * in another path is encountered.)
 * If three arguments, the atom that is returned is the one that
 * begins the shortest path containing the atom referred to by the
 * third argument. If multiple such paths, ordering is same as for
 * two arguments.
 * Further prioritization of paths is by molecular weight,
 * and then by lowest X, Y, Z values.
 * If last argument is DEBUG, all paths are written to stdout.
 * User interface:
 * %trans_path( a1 a2 ( a3 ) (DEBUG) )
 */

int SYB_MGEN_CONN_BEST( identifier, nargs, args, Writer )
/* following arguments contain the text supplied to the %trans_path()
   expression generator, and provide an avenue for producing text output. */
char *identifier;
int nargs;
char *args[];
PFI Writer;
{
#define MAX_NP 8
struct pathrec {
   int root, nrings, chosen, nats;
   float mw, xyz[3];
   set_ptr path;
};
struct pathrec p[MAX_NP];
int retval, i, np, toroot, al, a2, a4, a, pnow, pdone, growing,
    final_pos, area_num, new_rings, nats, nuats, elem, ncycles,
    best, debug, ringclosed;
List_Ptr atom_exp_list = NIL, SYB_EXPR_ANALYZE();
mol_ptr ml, m2, SYB_AREA_GET_MOLECULE();
atom_ptr arec, SYB_ATOM_FIND_REC();
/* A set_ptr data structure is a Boolean set, first word containing
   its cardinality. */
set_ptr atom_setl = NIL, a2chk = NIL, nuls = NIL, cnats = NIL,
        nxcn = NIL, end_atoms = NIL, scratch = NIL,
        SYB_ATOM_FIND_SET(), UTL_SET_CREATE();
char tempString[256];
float get_path_mw(), diff;
void get_path_xyz();

retval = 0;
retval = 0; 

/* Check the number of arguments */
if ( nargs < 2 || nargs > 4 ) {
   UIMS2_WRITE_ERROR(
      "Error: %trans_path requires 2 to 4 arguments \n" );
   return 0;
}
np = 0;
debug = ( !UTL_STR_CMP_NOCASE( args[ nargs - 1 ], "DEBUG" ) );
toroot = (debug && nargs == 4) || (!debug && nargs == 3);
/* PARSE THE INPUT */
/* get first atom */
if ( !(atom_exp_list = SYB_EXPR_ANALYZE( SYB_EXPR_GET_ATOM_TOKEN, args[0],
        &final_pos, &area_num )) )
   goto error;
if ( !(ml = SYB_AREA_GET_MOLECULE( area_num )) )
   goto cleanup;
if ( !(atom_setl = SYB_ATOM_FIND_SET( ml, atom_exp_list )) )
   goto error;
if ( atom_exp_list )
   SYB_EXPR_DELETE_RPN_LIST( atom_exp_list );
atom_exp_list = (List_Ptr) NIL;
if ( !(1 == UTL_SET_CARDINALITY( atom_setl )) ) {
   UIMS2_WRITE_ERROR(
      "Error: First argument must be only one atom\n" );
   goto error;
}
if ( !(arec = SYB_ATOM_FIND_REC( ml, UTL_SET_NEXT( atom_setl, -1 ) )) ) goto error;
al = arec->recno;
UTL_SET_DESTROY( atom_setl );
atom_setl = NIL;
/* get 2nd atom */
if ( !(atom_exp_list = SYB_EXPR_ANALYZE( SYB_EXPR_GET_ATOM_TOKEN, args[1],
        &final_pos, &area_num )) )
   goto error;
if ( !(m2 = SYB_AREA_GET_MOLECULE( area_num )) )
   goto cleanup;
if ( !(end_atoms = SYB_ATOM_FIND_SET( m2, atom_exp_list )) )
   goto error;
if ( atom_exp_list )
   SYB_EXPR_DELETE_RPN_LIST( atom_exp_list );
atom_exp_list = (List_Ptr) NIL;
if ( ml != m2 ) {
   UIMS2_WRITE_ERROR(
      "Error: atoms must be in the same molecule\n" );
   goto error;
}
if ( !(1 == UTL_SET_CARDINALITY( end_atoms )) ) {
   UIMS2_WRITE_ERROR(
      "Error: Second argument must be only one atom\n" );
   goto error;
}
if ( !(arec = SYB_ATOM_FIND_REC( ml, UTL_SET_NEXT( end_atoms, -1 ) )) ) goto error;
a2 = arec->recno;

/* get 3rd atom */
if ( toroot ) {
   if ( !(atom_exp_list = SYB_EXPR_ANALYZE( SYB_EXPR_GET_ATOM_TOKEN, args[2],
           &final_pos, &area_num )) )
      goto error;
   if ( !(m2 = SYB_AREA_GET_MOLECULE( area_num )) )
      goto cleanup;
   if ( !(atom_setl = SYB_ATOM_FIND_SET( m2, atom_exp_list )) )
      goto error;
   if ( atom_exp_list )
      SYB_EXPR_DELETE_RPN_LIST( atom_exp_list );
   atom_exp_list = (List_Ptr) NIL;
   if ( ml != m2 ) {
      UIMS2_WRITE_ERROR(
         "Error: atoms must be in the same molecule\n" );
      goto error;
   }
   if ( !(1 == UTL_SET_CARDINALITY( atom_setl )) ) {
      UIMS2_WRITE_ERROR(
         "Error: Second argument must be only one atom\n" );
      goto error;
   }
   if ( !(arec = SYB_ATOM_FIND_REC( ml, UTL_SET_NEXT( atom_setl, -1 ) )) ) goto error;
   a4 = arec->recno;
   UTL_SET_DESTROY( atom_setl );
   atom_setl = NIL;
}

/* GENERATE the paths */
/* set up paths */
if ( !(a2chk = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;
if ( !(nuls = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;
if ( !(cnats = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;
if ( !(nxcn = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;
if ( !(scratch = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;

if ( !syb_mgen_conn_att_atoms( a2chk, ml, al ) ) goto error;
if ( !UTL_SET_MEMBER( a2chk, a2 ) ) {
   UIMS2_WRITE_ERROR(
      "Error: second argument atom is not bonded to first argument atom\n" );
   goto error;
}
UTL_SET_DELETE( a2chk, a2 );
a = -1;
np = 0;
while ( np < MAX_NP && (a = UTL_SET_NEXT( a2chk, a )) >= 0 ) {
   if ( !(p[np].path = UTL_SET_CREATE( ml->max_atoms + 1 )) ) goto error;
   p[np].root = a;
   p[np].nrings = 0;
   UTL_SET_INSERT( p[np].path, a );
   np++;
}

/* grow the paths */
growing = TRUE;
nats = 0;
ncycles = 0;
while ( growing ) {
   nuats = 0;
   ringclosed = FALSE;
   for ( pnow = 0; pnow < np; pnow++ ) {
      UTL_SET_COPY_INPLACE( cnats, p[pnow].path );
      UTL_SET_CLEAR( nxcn );
      elem = -1;
/* accumulate this generation of attached atoms into nxcn */
      while ( (elem = UTL_SET_NEXT( cnats, elem )) >= 0 ) {
         UTL_SET_CLEAR( nuls );
         if ( !syb_mgen_conn_att_atoms( nuls, ml, elem ) ) return ( FALSE );
         UTL_SET_DELETE( nuls, al );
         UTL_SET_DIFF_INPLACE( nuls, end_atoms, nuls );
         UTL_SET_OR_INPLACE( nxcn, nuls, nxcn );
         UTL_SET_DIFF_INPLACE( nxcn, p[pnow].path, nxcn );
      }
      UTL_SET_OR_INPLACE( p[pnow].path, nxcn, p[pnow].path );
/* remove and mark ring closures when growing out */
      if ( !toroot ) for ( pdone = 0; pdone < np; pdone++ ) if ( pdone != pnow ) {
         UTL_SET_AND_INPLACE( p[pnow].path, p[pdone].path, a2chk );
         if ( (new_rings = UTL_SET_CARDINALITY( a2chk )) ) {
/* we have ring closure(s) */
            p[pnow].nrings += new_rings;
            p[pdone].nrings += new_rings;
            ringclosed = TRUE;
            UTL_SET_OR_INPLACE( end_atoms, a2chk, end_atoms );
/* if pdone < pnow, two branches are now same lengths, drop common atom from bot
   but if >, branches are different, and must avoid repeated closing */
            if ( pdone < pnow ) {
/* remove atom(s) in the previous branch because paths are really same length */
               UTL_SET_DIFF_INPLACE( p[pdone].path, a2chk, p[pdone].path );
               UTL_SET_DIFF_INPLACE( p[pnow].path, a2chk, p[pnow].path );
            }
            else {
/* must identify and mark each atom in nxcn that is attached to a2chk atom */
               elem = -1;
               while ( (elem = UTL_SET_NEXT( a2chk, elem )) >= 0 ) {
                  UTL_SET_CLEAR( scratch );
                  if ( !syb_mgen_conn_att_atoms( scratch, ml, elem ) )
                     return ( FALSE );
                  UTL_SET_AND_INPLACE( scratch, nxcn, scratch );
                  UTL_SET_OR_INPLACE( end_atoms, scratch, end_atoms );
               }
            }
         }
      }
   }
/* done growing paths if no more atoms added to any path . . */
   for ( pdone = 0, nuats = 0; pdone < np; pdone++ )
      nuats += UTL_SET_CARDINALITY( p[pdone].path );
   if ( nuats <= nats && !ringclosed ) growing = FALSE;
   nats = nuats;
/* . . or looking for the 4th atom and found it . . */
   if ( toroot ) for ( pdone = 0; pdone < np; pdone++ )
      if ( UTL_SET_MEMBER( p[pdone].path, a4 ) ) growing = FALSE;
/* ..or after 100 atom layers out regardless */
   ncycles++;
   if ( ncycles >= 100 ) growing = FALSE;
}
/* debugging */
if ( debug ) for ( pdone = 0; pdone < np; pdone++ ) {
   sprintf( tempString, "Path %d (%d rings, from %d) : ",
      pdone+1, p[pdone].nrings, p[pdone].root );
   UBS_OUTPUT_MESSAGE( stdout, tempString );
   ashow( p[pdone].path, ml );
}
/* compute the path properties */
for ( pdone = 0; pdone < np; pdone++ ) {
/* mark as already chosen any path that can't be an answer */
   p[pdone].chosen = toroot && !UTL_SET_MEMBER( p[pdone].path, a4 );
   p[pdone].nats = UTL_SET_CARDINALITY( p[pdone].path );
   p[pdone].nrings = p[pdone].nrings ? 1 : 0;
   p[pdone].mw = 0.0;
   p[pdone].xyz[0] = p[pdone].xyz[1] = p[pdone].xyz[2] = 0.0;
}
/* return the best result */
best = 0;
for ( pdone = 1; pdone < np; pdone++ ) {
   if ( toroot ) {
      if ( p[best].chosen && !p[pdone].chosen ) best = pdone;
/* looking backward along chain, always grow away from more negative coord value */
      if ( !p[best].chosen && !p[pdone].chosen ) {
         get_path_xyz( p[pdone].root, ml, p[pdone].xyz );
         get_path_xyz( p[best].root, ml, p[best].xyz );
         for ( i = 0; i < 3; i++ ) {
            diff = p[pdone].xyz[i] - p[best].xyz[i];
            if ( diff < -0.1 ) {
               best = pdone;
               break;
            }
            if ( diff > 0.1 ) break;
         }
      }
   }
   else {
      if ( p[pdone].nrings && !p[best].nrings ) best = pdone;
      else if ( p[pdone].nats > p[best].nats ) best = pdone;
      else if ( p[pdone].nats == p[best].nats ) {
         p[pdone].mw = get_path_mw( p[pdone].path, ml, p[pdone].mw );
         p[best].mw = get_path_mw( p[best].path, ml, p[best].mw );
         if ( p[pdone].mw > p[best].mw ) best = pdone;
      }
   }
}
arec = SYB_ATOM_FIND_REC( ml, p[best].root );
sprintf( tempString, "%d", arec->id );
if ( !(*Writer)( tempString ) ) goto error;

retval = TRUE;
error:
cleanup:
if ( atom_exp_list )
   SYB_EXPR_DELETE_RPN_LIST( atom_exp_list );
if ( atom_setl )
   UTL_SET_DESTROY( atom_setl );
if ( end_atoms )
   UTL_SET_DESTROY( end_atoms );
if ( a2chk )
   UTL_SET_DESTROY( a2chk );
if ( nuls )
   UTL_SET_DESTROY( nuls );
if ( nxcn )
   UTL_SET_DESTROY( nxcn );
if ( cnats )
   UTL_SET_DESTROY( cnats );
if ( scratch )
   UTL_SET_DESTROY( scratch );
return ( retval );
}

static int syb_mgen_conn_att_atoms( aset, m, atid )
/* put atoms attached to atm into aset */
/* WORKS STRICTLY WITH RECNOS */
set_ptr aset;
mol_ptr m;
int atid;
{
atom_ptr at, SYB_ATOM_FIND_ID();
List_Ptr tohs, UTL_LIST_RETRIEVE_P();
atom_ptr toh, SYB_ATOM_FIND_REC();
acon_ptr conn1;
int nbytes1;

at = SYB_ATOM_FIND_REC( m, atid );
tohs = at->conn_atom;
while ( tohs ) {
   tohs = UTL_LIST_RETRIEVE_P( tohs, &conn1, &nbytes1 );
   toh = SYB_ATOM_FIND_REC( m, conn1->target );
   UTL_SET_INSERT( aset, toh->recno );
}
return ( TRUE );
};

static float get_path_mw( aset, m, mw )
/* returns the total atomic weight of all atoms in aset */
set_ptr aset;
mol_ptr m;
float mw;
{
int elem = -1;
float ans = 0.0;
atom_ptr at, SYB_ATOM_FIND_REC();
fpt SYB_ATAB_ATOMIC_WEIGHT();

if ( mw ) return ( mw );
elem = -1;
while ( (elem = UTL_SET_NEXT( aset, elem )) >= 0 ) {
   at = SYB_ATOM_FIND_REC( m, elem );
   ans += (float) SYB_ATAB_ATOMIC_WEIGHT( at->type );
}
return ( ans );
}

static void get_path_xyz( aid, m, mw )
/* returns the xyz of the supplied atom */
int aid;
mol_ptr m;
float mw[3];
{
int i;
atom_ptr at, SYB_ATOM_FIND_REC();

if ( mw[0] ) return;
at = SYB_ATOM_FIND_REC( m, aid );
for ( i = 0; i < 3; i++ ) mw[i] = at->xyz[i];
return;
}

static int ashow( aset, m )
/* for interactive debugging, shows a set's membership in terms of atom ID */
set_ptr aset;
mol_ptr m;
{
char buff[1000], *b;
atom_ptr at, SYB_ATOM_FIND_REC();
int elem;

*buff = '\0';
b = buff;
elem = -1;
while ( (elem = UTL_SET_NEXT( aset, elem )) >= 0 ) {
   at = SYB_ATOM_FIND_REC( m, elem );
   sprintf( b, " %d", at->id );
   b = buff + strlen( buff );
}
sprintf( b, "\n" );
UBS_OUTPUT_MESSAGE( stdout, buff );
}

/* BEGINNING OF SUBROUTINES I-D. Calculation of attenuated fields */

/*+E : QSAR_FIELD_EVAL_RB_ATTEN() */
/**********************************************************************/
/* int QSAR_FIELD_EVAL_RB_ATTEN( molp, stfldp, elfldp, regp, no_st, no_el, ctp ) */
/* */
/* Dick Cramer May 13, 1995 */
/* */
/*
   "Standard CoMFA" except that the contribution of any atom
   to the field falls off with an inverse power of its distance
   from a root atom, measured in NUMBER OF ROTATABLE BONDS!

   This means also that each individual atom's contribution
   has a similarly scaled upper bound, rather than checking
   the upper bound only for the sum over all atoms.
*/
/* This procedure computes vdW 6-12 steric values at each point in region */
/* and the electrostatic interactions (initially assuming 1/r dielectric). */
/* */
/* NOTE:: initially ignoring space averaging, other user knobs. */
/* note:: assuming valid input here; error checking higher up! */
/* */
/* Input: */
/* molp - molecule pointer, molecule to place in region. */
/* stfldp - steric field pointer, where values will be placed. */
/* elfldp - electrostatic field pointer, where values will be placed. */
/* regp - region pointer, locations where values are to be evaluated. */
/* no_st - flag to skip steric evaluations */
/* no_el - flag to skip electrostatic evaluations */
/* ctp - ComfaTopPtr, for dummy/lp values */
/* */
/* Returns 0 on failure, 1 otherwise. */
/* */
/**********************************************************************/
/*+E : QSAR_FIELD_EVAL_RB_ATTEN() */

int QSAR_FIELD_EVAL_RB_ATTEN( molp, stfldp, elfldp, regp, no_st, no_el, ctp )
mol_ptr molp;
FieldPtr stfldp, elfldp;
RegionPtr regp;
int no_st, no_el;
ComfaTopPtr ctp;
{
BoxPtr box;
atom_ptr at, SYB_ATOM_FIND_ID();
int fid, b, ix, iy, iz, nat, vol_avg, repulsive;
fpt *steric, *elect, SYB_ATAB_VDW_RADII();
fpt diff, dis, dis2, x, y, z, sum_steric, sum_elect;
fpt dis6, dis12, repuls_val, offs[9][3], atm_ste, atm_ele;
fpt *charge, *ctemp, *coord, *ftemp, *wt, scale_vol_avg, atm_steric, atm_elect;
int *atyp, *itemp, dohbd, dohba, ishbd, retval, dielectric, off, atid;
static fpt hbond_scal;
fpt hbond_A, hbond_B, *AtWts = NIL, *QSAR_FIELD_RB_WTS();
int *HAs, *HDs, *HAp, *HDp; /* sets would be more efficient but slower */
int do_steric, do_elect;
set_ptr hdonor, SYB_HBOND_DONORS(), pset = NIL, aset = NIL;

#define Q2KC 332.0
#define MIN_SQ_DISTANCE 1.0e-4
/* any atom within 10^-2 Angstroms is hereby zapped!
   this is about it: 10^6 / 10^-24 is close to overflow! */

ftemp = NIL; ctemp = NIL; itemp = NIL; retval = FALSE; HAs = NIL; HDs = NIL;
hdonor = NIL;

/* for now, make root atom the one closest to 0,0,0 */
for (nat = 1; nat <= molp->natoms; nat++) {
    at = SYB_ATOM_FIND_ID( molp, nat );
    dis2 = at->xyz[0] * at->xyz[0] + at->xyz[1] * at->xyz[1] +
           at->xyz[2] * at->xyz[2];
    if (nat == 1 || dis2 < dis) {
        dis = dis2;
        atid = nat;
    }
}

/* following is specific to topomeric fields */
if (!(AtWts = QSAR_FIELD_RB_WTS( molp, atid ))) goto cleanup;

if ( !no_el )
{ dielectric = elfldp->dielectric;
  vol_avg = elfldp->vol_avg_type;
  scale_vol_avg = elfldp->scale_vol_avg;
  repulsive = elfldp->repulsive;
  repuls_val = repexp[repulsive]; elect = elfldp->field_value; }
if ( !no_st )
{ vol_avg = stfldp->vol_avg_type;
  scale_vol_avg = stfldp->scale_vol_avg;
  repulsive = stfldp->repulsive;
  repuls_val = repexp[repulsive]; steric = stfldp->field_value; }

if (!(ftemp = (fpt *) UTL_MEM_ALLOC( 3*sizeof(fpt)*molp->natoms ))) goto cleanup;
if (!(ctemp = (fpt *) UTL_MEM_ALLOC( sizeof(fpt)*molp->natoms ))) goto cleanup;
if (!(itemp = (int *) UTL_MEM_ALLOC( sizeof(int)*molp->natoms ))) goto cleanup;
if (!(HAs = (int *) UTL_MEM_ALLOC( sizeof(int)*molp->natoms ))) goto cleanup;
if (!(HDs = (int *) UTL_MEM_ALLOC( sizeof(int)*molp->natoms ))) goto cleanup;
/* get just those H's which are capable of Hbonding */
if (!(hdonor = SYB_HBOND_DONORS( molp, NIL ))) goto cleanup;

for (coord=ftemp, atyp=itemp, charge=ctemp, HAp=HAs, HDp=HDs, nat=1;
     nat<=molp->natoms; nat++)
{ if (NIL == (at = SYB_ATOM_FIND_ID( molp, nat ))) goto cleanup;
  *coord++ = at->xyz[0];
  *coord++ = at->xyz[1];
  *coord++ = at->xyz[2];
  *atyp++ = at->type - 1;
  *charge++ = at->charge;
  *HAp++ = SYB_ATAB_HBOND_ACCEPT( at->type );
  *HDp++ = UTL_SET_MEMBER( hdonor, at->recno );
}

for (b=0; b<regp->n_boxes; b++) {
    box = &regp->box_array[b];
    dohbd = (SYB_ATAB_ATOMIC_NUMBER( box->atom_type ) == 1) &&
            (box->pt_charge == 1.0);
    dohba = (SYB_ATAB_ATOMIC_NUMBER( box->atom_type ) == 8) &&
            (box->pt_charge == -1.0);
    if (dohbd || dohba) {
        if (!TAILOR_STORE_IT_HERE( "TAILOR!FORCE_FIELD!HBOND_RAD_SCALING",
                                   &hbond_scal, 1 )) goto cleanup;
        hbond_A = pow( hbond_scal, 6.0 );
        hbond_B = hbond_A * hbond_A;
    }
    if (vol_avg)
        QSAR_FIELD_EVAL_GETOFF( offs, box->stepsize, vol_avg, scale_vol_avg );
    if ( !no_st )
        QSAR_FIELD_VDWTAB( box->atom_type, repuls_val, ctp->du_lp_steric );
    for (iz=0, z=box->lo[2]; iz < box->nstep[2]; iz++, z += box->stepsize[2])
     for (iy=0, y=box->lo[1]; iy < box->nstep[1]; iy++, y += box->stepsize[1])
      for (ix=0, x=box->lo[0]; ix < box->nstep[0]; ix++, x += box->stepsize[0])
      {
       for ( coord = ftemp, charge = ctemp, atyp = itemp, HAp=HAs, HDp=HDs,
             do_steric=TRUE, do_elect=TRUE, nat=0, sum_steric = sum_elect = 0.0,
             wt = AtWts;
             nat < molp->natoms;
             nat++, wt++)
       {
        if ( ( *atyp == DUMMY-1 || *atyp == LP-1 ) && !ctp->du_lp_elect )
            *charge = 0.0; /* set charge to 0 since ignoring Du/lp */
        if (!vol_avg) /* the "normal" case */
        {
            dis2 = x - *coord++;
            dis2 *= dis2;
            diff = y - *coord++;
            diff *= diff;
            dis2 += diff;
            diff = z - *coord++;
            diff *= diff;
            dis2 += diff;

            if ( !no_el && elfldp->zap_el==2 && do_elect ) {
                dis = sqrt( dis2 );
                if ( dis < SYB_ATAB_VDW_RADII( *atyp+1 ) ) {
                    /* short circuits! */
                    *elect++ = 0.0;
                    do_elect = FALSE;
                }
            }

            if ( dis2 < MIN_SQ_DISTANCE ) {
                if ( !no_st )
                    /* if atom has no steric value, we don't care about
                       MIN_SQ_DISTANCE since it has no contribution anyway */
                    if ( vdw_a[*atyp] != 0.0 && vdw_b[*atyp] != 0.0 ) {
                        /* set sterics to its max value at current grid pt. */
                        atm_steric = (*wt) * stfldp->max_value;
                    }
                if ( !no_el && do_elect ) {
                    if ( !no_st && !do_steric && elfldp->zap_el ) {
                        *elect++ = DAB_F_MISSING;
                    }
                    else if ( *charge != 0.0 ) {
                        if ( *charge > 0.0 )
                            atm_elect = (*wt) * elfldp->max_value;
                        else atm_elect = (*wt) * -elfldp->max_value;
                    }
                }
                if ( !do_elect && !do_steric )
                    break; /* break out of loop since neither el. or st.
                              need to be calculated for this grid point */
                /* setting dis2 to 1 (an arbitrary no.) will prevent a zero
                   divide in the sum_steric or sum_elect calculations below */
                dis2 = 1.0;
            }

            if ( !no_st && do_steric ) {
                dis6 = dis2 * dis2 * dis2;
                dis12 = dis6 * dis6;
                if ( repulsive )
                    dis12 = (repulsive==1) ? dis12 / dis2 : dis12 / dis2 / dis2;
                if (dohbd && *HAp)
                    atm_steric = hbond_B * vdw_b[*atyp]/dis12 -
                                 hbond_A * vdw_a[*atyp]/dis6;
                else if (dohba && *HDp)
                    atm_steric = hbond_B * vdw_b[*atyp]/dis12 -
                                 hbond_A * vdw_a[*atyp]/dis6;
                else
                    atm_steric = vdw_b[*atyp]/dis12 - vdw_a[*atyp]/dis6;
                HAp++; HDp++; }

            atm_steric = atm_steric > stfldp->max_value ? stfldp->max_value
                                                        : atm_steric;
            atm_steric *= (*wt);
            if ( !no_el && do_elect ) {
                atm_elect = *charge++ /
                            ( dielectric ? sqrt(dis2) : dis2 );
                atm_elect = atm_elect > elfldp->max_value ? elfldp->max_value
                                                          : atm_elect;
                atm_elect = atm_elect < -(elfldp->max_value) ? -(elfldp->max_value)
                                                             : atm_elect;
                atm_elect *= (*wt);
                sum_elect += atm_elect;
            }
            atyp++;
            sum_steric += atm_steric;
        }

        else
        {
            for (off=0; off<9; off++)
            { }
            coord += 3;
            atyp++;
            charge++;
            HAp++;
            HDp++;
        }
       } /* atom loop */
doneatoms:
       if ( do_steric || do_elect ) {
           if (vol_avg) { sum_elect /= 9.0; sum_steric /= 9.0; }
           if ( !no_el && do_elect )
           { *elect = sum_elect * box->pt_charge * Q2KC;
             if ( *elect > elfldp->max_value ) *elect = elfldp->max_value;
             else if ( *elect < -elfldp->max_value ) *elect =
                                                     -elfldp->max_value;
             transform_field( elfldp->max_value, elect, ctp );
             elect++;
           }
           if ( !no_st && do_steric )
           { *steric = sum_steric;
             if ( *steric > stfldp->max_value )
             { *steric = stfldp->max_value;
               if (!no_el && elfldp->zap_el==1 ) *(elect-1) = DAB_F_MISSING; }
             transform_field( stfldp->max_value, steric, ctp );
             steric++; }
       }
      } /* points in box loop */
} /* boxes loop */

retval = TRUE;
cleanup:

if (itemp) UTL_MEM_FREE( itemp );
if (ftemp) UTL_MEM_FREE( ftemp );
if (ctemp) UTL_MEM_FREE( ctemp );
if (HAs) UTL_MEM_FREE( HAs );
if (HDs) UTL_MEM_FREE( HDs );
if (hdonor) UTL_SET_DESTROY( hdonor );
if (AtWts) UTL_MEM_FREE( AtWts );
if (pset) UTL_MEM_FREE( pset );
if (aset) UTL_MEM_FREE( aset );
return retval;
}

#undef Q2KC
#undef MIN_SQ_DISTANCE
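
The per-atom capping and weighting described in the header comment of QSAR_FIELD_EVAL_RB_ATTEN can be looked at in isolation. What follows is a minimal sketch under stated assumptions, not the SYBYL routine: the function name atten_steric and all of its parameters are hypothetical, a and b stand in for the tabulated vdw_a and vdw_b coefficients, and hydrogen-bond scaling, alternative repulsive exponents and volume averaging are left out.

```c
#include <assert.h>

/* Hypothetical helper: one atom's attenuated 6-12 steric contribution.
   a, b  - attractive and repulsive vdW coefficients (assumed inputs)
   dis2  - squared probe-to-atom distance, must be positive
   wt    - rotatable-bond attenuation weight for this atom
   emax  - upper bound applied to this single atom before weighting */
double atten_steric(double a, double b, double dis2, double wt, double emax)
{
    double dis6, dis12, e;

    assert(dis2 > 0.0);
    dis6 = dis2 * dis2 * dis2;
    dis12 = dis6 * dis6;
    e = b / dis12 - a / dis6;   /* plain 6-12 term */
    if (e > emax) e = emax;     /* cap the individual atom first ... */
    return wt * e;              /* ... then attenuate it */
}
```

Because the cap is applied before the weight, an atom several rotatable bonds away from the root can never contribute more than wt times the field maximum, which is the "similarly scaled upper bound" mentioned in the header comment.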

/* ------------------------------------------------------------------ */

static fpt *QSAR_FIELD_RB_WTS( molp, rootid )
/* generates rotational-bond wts for each atom */
mol_ptr molp;
int rootid;
{
/* pseudo code for FIELD_RB_WTS()
   while saw new atoms
       uncover atoms that stopped last shell growth
       grow next "rotational shell"
       while adding to shell
           for each atom in shell
               get neighbors not seen
               for each neighbor
                   if bond is rotatable (acyclic, >1 attached atom, not =,am,#)
                       cover all other atoms attached to atom for this shell
                   add it to shell
*/
fpt *ansr = NIL, *vals = NIL, factor, nowfact = 1.0;
int found, aggcount, atid, aggid, loop, size;
set_ptr aggats = NIL, allats = NIL, nuls = NIL, endatms = NIL, end_cands = NIL;
atom_ptr root, SYB_ATOM_FIND_REC(), at, atrec;
bond_ptr b, SYB_BOND_FIND_REC();
List_Ptr toats, UTL_LIST_RETRIEVE_P();
acon_ptr cptr;
char tempString[200];
void ashow(), qsar_field_attached_atoms();

if (!(vals = (fpt *) UTL_MEM_ALLOC( sizeof(fpt)*molp->natoms ))) return( NIL );
if (!UIMS2_VAR_GET_TOKEN( "TAILOR!COMFA!AGGREG_DESCALE",
                          &factor )) return( NIL );

if (!(allats = UTL_SET_CREATE( molp->max_atoms + 1 ))) goto cleanup;
if (!(aggats = UTL_SET_CREATE( molp->max_atoms + 1 ))) goto cleanup;
if (!(nuls = UTL_SET_CREATE( molp->max_atoms + 1 ))) goto cleanup;
if (!(endatms = UTL_SET_CREATE( molp->max_atoms + 1 ))) goto cleanup;
if (!(end_cands = UTL_SET_CREATE( molp->max_atoms + 1 ))) goto cleanup;

if (!(root = SYB_ATOM_FIND_REC( molp, rootid ))) goto cleanup;

UTL_SET_INSERT( aggats, root->recno );
UTL_SET_INSERT( allats, root->recno );
aggcount = loop = 1;

while (TRUE) {
    while (TRUE) {
        aggid = -1;
        while ((aggid = UTL_SET_NEXT( allats, aggid )) >= 0 ) {
            UTL_SET_CLEAR( nuls );
            qsar_field_attached_atoms( nuls, molp, aggid );
            UTL_SET_DIFF_INPLACE( nuls, allats, nuls );
            UTL_SET_DIFF_INPLACE( nuls, endatms, nuls );
            /* identifying any atoms that terminate this aggregate */
            atid = -1;
            while ((atid = UTL_SET_NEXT( nuls, atid )) >= 0 ) {
                if (!( at = SYB_ATOM_FIND_REC( molp, atid ) )) goto cleanup;
                /* skipping monovalent atoms */
                if (at->nbond > 1) {
                    /* find bond record that attaches to aggid */
                    toats = at->conn_atom;
                    found = FALSE;
                    while (toats && !found ) {
                        toats = UTL_LIST_RETRIEVE_P( toats, &cptr, &size );
                        found = (cptr->target == aggid );
                    }
                    if (!found) goto cleanup;
                    b = SYB_BOND_FIND_REC( molp, cptr->bond_rec );
                    if ( !(b->status & BOND_V_IRING) && !(b->status & BOND_V_ERING)
                         && (b->type == SYB_BTAB_MNEM_TO_TYPE( "1" )) ) {
                        /* have an end-of-aggregate atom, mark as end atoms
                           all other attached atoms */
                        UTL_SET_CLEAR( end_cands );
                        qsar_field_attached_atoms( end_cands, molp, at->recno );
                        UTL_SET_DELETE( end_cands, aggid );
                        UTL_SET_OR_INPLACE( endatms, end_cands, endatms );
                    }
                }
            }
            UTL_SET_OR_INPLACE( aggats, nuls, aggats );
        }
        if (UTL_SET_CARDINALITY( aggats ) <= aggcount ) break;
        aggcount = UTL_SET_CARDINALITY( aggats );
        UTL_SET_OR_INPLACE( allats, aggats, allats );
    }
    /* debugging stuff .. */
    sprintf( tempString, "Aggregate %d (weight = %f): ", loop, nowfact );
    UBS_OUTPUT_MESSAGE( stdout, tempString );
    ashow( aggats, molp );

    /* if no atoms added, we are done! */
    if (UTL_SET_EMPTY( aggats )) break;
    /* record scaling factor for atoms in this aggregate */
    atid = -1;
    while ((atid = UTL_SET_NEXT( aggats, atid )) >= 0 ) {
        if (!(atrec = SYB_ATOM_FIND_REC( molp, atid ))) goto cleanup;
        vals[ (atrec->id) - 1 ] = nowfact;
    }
    UTL_SET_OR_INPLACE( allats, aggats, allats );
    UTL_SET_CLEAR( aggats );
    UTL_SET_CLEAR( endatms );
    aggcount = 0;
    nowfact *= factor;
    loop++;
}
ansr = vals;
cleanup:

if (aggats) UTL_SET_DESTROY( aggats );
if (allats) UTL_SET_DESTROY( allats );
if (endatms) UTL_SET_DESTROY( endatms );
if (end_cands) UTL_SET_DESTROY( end_cands );
if (nuls) UTL_SET_DESTROY( nuls );
return ( ansr );
}
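
The idea behind QSAR_FIELD_RB_WTS, a weight that shrinks by a constant factor for every rotatable bond separating an atom from the root, can be sketched without the SYBYL set and list utilities. The code below is an illustration under assumptions, not a transcription of the listing: struct rb_bond, rb_weights and MAXAT are hypothetical names, the caller is assumed to have already decided which bonds are rotatable, and the listing's exact shell bookkeeping (end atoms, ring flags, monovalent atoms) is not reproduced.

```c
#include <assert.h>

#define MAXAT 32   /* assumed upper limit, for the sketch only */

/* Hypothetical bond record: atoms a and b (0-based); rot is nonzero
   when the bond is to be treated as rotatable. */
struct rb_bond { int a, b, rot; };

/* Fills wts[i] with factor^k, where k is the smallest number of rotatable
   bonds on any path from root to atom i. Atoms not connected to root get
   weight 0. Returns 0 on bad input, 1 otherwise. */
int rb_weights(int natoms, const struct rb_bond *bonds, int nbonds,
               int root, double factor, double *wts)
{
    int k[MAXAT], i, s, changed;

    if (natoms < 1 || natoms > MAXAT || root < 0 || root >= natoms) return 0;
    for (i = 0; i < natoms; i++) k[i] = -1;   /* -1 means not reached yet */
    k[root] = 0;

    do {                                      /* relax bonds until stable */
        changed = 0;
        for (i = 0; i < nbonds; i++) {
            int a = bonds[i].a, b = bonds[i].b, step = bonds[i].rot ? 1 : 0;
            assert(a >= 0 && a < natoms && b >= 0 && b < natoms);
            if (k[a] >= 0 && (k[b] < 0 || k[a] + step < k[b])) {
                k[b] = k[a] + step; changed = 1;
            }
            if (k[b] >= 0 && (k[a] < 0 || k[b] + step < k[a])) {
                k[a] = k[b] + step; changed = 1;
            }
        }
    } while (changed);

    for (i = 0; i < natoms; i++) {
        double w = (k[i] < 0) ? 0.0 : 1.0;
        for (s = 0; s < k[i]; s++) w *= factor;   /* factor^k */
        wts[i] = w;
    }
    return 1;
}
```

For a four-atom chain in which only the last two bonds are rotatable and factor is 0.5, the weights come out as 1, 1, 0.5 and 0.25, so atoms further out along the chain contribute progressively less to the field.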

static void qsar_field_attached_atoms( aset, m, atid )
/* ors atoms attached to atm into aset */
/* WORKS STRICTLY WITH RECNOS */
set_ptr aset;
mol_ptr m;
int atid;
{
atom_ptr at, SYB_ATOM_FIND_ID();
List_Ptr tohs, UTL_LIST_RETRIEVE_P();
atom_ptr toh, SYB_ATOM_FIND_REC();
acon_ptr connl;
int nbytesl;

at = SYB_ATOM_FIND_REC( m, atid );
tohs = at->conn_atom;
while (tohs) {
    tohs = UTL_LIST_RETRIEVE_P( tohs, &connl, &nbytesl );
    toh = SYB_ATOM_FIND_REC( m, connl->target );
    UTL_SET_INSERT( aset, toh->recno );
}
return;
}



static void ashow( aset, m )
/* for interactive debugging, shows a set's membership in terms of atom ID */
set_ptr aset;
mol_ptr m;
{
char buff[1000], *b;
atom_ptr at, SYB_ATOM_FIND_REC();
int elem;

*buff = '\0';
b = buff;
elem = -1;

while ( (elem = UTL_SET_NEXT( aset, elem )) >= 0 ) {
    at = SYB_ATOM_FIND_REC( m, elem );
    sprintf( b, " %d", at->id );
    b = buff + strlen( buff );
}
sprintf( b, "\n" );
UBS_OUTPUT_MESSAGE( stdout, buff );
}

# Section II-A. SPL invoked shell for computing the diagonal defining the
# "best" triangle, e.g., the one with the highest density of points below

@expression_generator LRT_FAST

# Usage:
# lrt_fast rows descriptor_cols bio_col [pls flags like scaling in quotes]
# rows (*) - rows to take
# descriptor_cols - which columns are the neighborhood metrics
# bio_col - which column has the bio (probably log bio) data
# [...] - if need to SCAL NONE or anything like that, do it here
#
# returns a line of the form
# 3.09691 / 0.000546509 = 5666.71 - 496 : 496 :: 15.6981 : 15.6989
# ^ max bio difference
#           ^ optimal distance division for max bio
#                         ^ slope
#                                   ^ number in the lrt
#                                         ^ total number
#                                                ^ area in the lrt
#                                                          ^ total area
# Significance is related to whether ratio of numbers is
# much above ratio of areas.
#

globalvar SAMPLS_IN_PROGRESS DONE_CHECKED_OUT
localvar hold distname rows cols bio

setvar rows %promptif( "$1" ROW_EXP "*" "Rows to use in lrt" )
setvar cols %promptif( "$2" COL_EXP "COMFA*" "Columns of mol descriptors" )
setvar bio %promptif( "$3" COL_EXP "LOGBIO" "Column of bio data" )

setvar hold SAMPLS_IN_PROGRESS
setvar SAMPLS_IN_PROGRESS $bio

setvar distname TAILOR!HIER!DIST_FNAME
setvar TAILOR!HIER!DIST_FNAME lrt_fort.3

# here the information is computed and written to a file
# whose name is passed in via a TAILOR value
QSAR ANA DO | >$NULLDEV $rows $cols HIER $4 |

setvar SAMPLS_IN_PROGRESS $hold
setvar TAILOR!HIER!DIST_FNAME $distname

# contents of the file are returned to the caller
setvar hold %system("cat lrt_fort.3")
%return( "$hold" )



# 

# Section II-B. SPL script for computing the significance of the distribution
@expression_generator dochi

# computes the chi-square statistic for the number of points below
# the diagonal, null hypothesis being the area fraction of the total
#
# To be called as: %dochi( %lrt_fast() ), i.e., its inputs
# are exactly the output of %lrt_fast as described in the lrt_fast header.
#

setvar expected %math( $9 * $11 / $13 )
setvar sq %math( $7 - $expected )
setvar sg %math( $sq * $sq / $expected )
%return( $sg )
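
The arithmetic performed by dochi is a one-cell chi-square: under the null hypothesis the points are spread uniformly, so the number expected below the diagonal is the total number scaled by the area fraction. A minimal C rendering of the same three steps is given here as an illustration only; the function name lrt_chi_square and its parameter names are hypothetical, chosen to match the fields that %lrt_fast reports (number in the lrt, total number, area in the lrt, total area).

```c
#include <assert.h>

/* Hypothetical C counterpart of the dochi expression generator.
   n_in       - number of points below the diagonal   ($7 in the SPL script)
   n_total    - total number of points                ($9)
   area_in    - area below the diagonal               ($11)
   area_total - total area of the plot                ($13) */
double lrt_chi_square(double n_in, double n_total,
                      double area_in, double area_total)
{
    double expected, diff;

    assert(n_total > 0.0 && area_in > 0.0 && area_total > 0.0);
    expected = n_total * area_in / area_total;   /* null: uniform density */
    diff = n_in - expected;
    return diff * diff / expected;
}
```

With 40 points in total, half of the area below the diagonal and 30 points found there, the expected count is 20 and the statistic is 5.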



/* Section II-C. Computes the best diagonal in the "virtual graph" of biological
   distances vs property differences. */

int QSHELL_HIER_LRT( table, biocol, dmat, nrow, order, lmsg )
char *table;
int biocol, /* column in MSS with biological data */
    nrow,   /* dimension of dmat and order */
    *order; /* array of row IDs to consider */
fpt *dmat;  /* distance matrix for property distances */
char *lmsg; /* file name for results */
{
fpt *p, *q, fabs(), bmax;
int i, j, count, status_array;
char *fpt_colname;
FILE *out, *UTL_FILE_FOPEN();

/* need to get the bio values
   in the n^2 we can repack into n(n-1)/2 then add the n bio values
   and finish with the bio distances */

/* No error handling. Better be data in those rows! */

for (count=0, i=0; i<nrow; i++)
    for (j=0; j<i; j++)
        dmat[count++] = dmat[i*nrow + j];

q = p = dmat + ((nrow-1) * nrow) / 2;

TBL_ACCESS_INDEX_TO_COLNAME( table, biocol-1, &fpt_colname );
TBL_GRAB_INIT_FPTS( table, 1, &fpt_colname );
for (i=0; i<nrow; i++, p++)
    TBL_GRAB_GET_FPTS_INV( order[i]-1, &status_array, p );
TBL_GRAB_COMPLETE_FPTS();

bmax = 0.0;
for (count=0, i=0; i<nrow; i++)
    for (j=0; j<i; j++, count++)
        if ( (p[count] = fabs(q[i] - q[j])) > bmax ) bmax = p[count];

out = UTL_FILE_FOPEN( lmsg, "w" );
QSHELL_HIER_DO_LRT( out, count, dmat, p, bmax );
UTL_FILE_FCLOSE( out );
}



int QSHELL_HIER_DO_LRT( out, index, xsort, ysort, bmax )
FILE *out;
fpt *xsort, *ysort, bmax;
int index;
{
int *order, count, j, i, bad;
int bestN, bestI;
fpt den, bestDen;

#define CUTOFF ( bmax * ( xsort[order[i]] / xsort[order[j]] ) )

if (!(order = (int *) UTL_MEM_ALLOC( index * sizeof(int) ))) return 0;
for (i=0; i<index; i++) order[i] = i;
bestN = bestI = bad = 0;
bestDen = 0.0;

fpt_heapsort( index, xsort, order );

for (j=0; count=0, bad=0, j<index; j++)
{
    if (xsort[order[j]] == 0.0) continue;
    for (i=0; i<=j; i++)
        if (ysort[order[i]] <= CUTOFF) count++;
        else bad++;
    /* loop over all d <= this distance */
    if ((den = count / bmax / xsort[order[j]] * 2.0) > bestDen)
    { bestDen = den; bestI = j; bestN = index - bad; }
} /* loop over all distances */

den = bmax * xsort[order[index-1]];

sprintf( msg, "%g / %g = %g - %d : %d :: %g : %g\n",
         bmax, xsort[order[bestI]], bmax/xsort[order[bestI]],
         bestN, index, den - xsort[order[bestI]]*bmax/2.0, den );
UBS_OUTPUT_MESSAGE( out, msg );
UTL_MEM_FREE( order );
return 1;
}
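
The search carried out by QSHELL_HIER_DO_LRT can be restated compactly: each positive property distance is tried as the far end of a diagonal running from the origin up to bmax, the points on or below that diagonal are counted, and the diagonal giving the most points per unit of triangle area wins. The sketch below is a simplified, unsorted O(n^2) version written for illustration; best_diagonal and its parameters are hypothetical names, and it returns an index and a density rather than writing a report line as the listing does.

```c
#include <assert.h>

/* Hypothetical sketch of the best-diagonal search.
   x[i]  - property distance of pair i
   y[i]  - biological difference of pair i
   bmax  - largest biological difference
   Returns the index j whose diagonal (0,0)-(x[j],bmax) has the highest
   density of points on or below it, or -1 if no x[j] is positive.
   The density found is stored in *best_density when that is non-NULL. */
int best_diagonal(const double *x, const double *y, int n, double bmax,
                  double *best_density)
{
    int i, j, best = -1;
    double best_den = 0.0;

    assert(n >= 0 && bmax > 0.0);
    for (j = 0; j < n; j++) {
        int count = 0;
        double den;

        if (x[j] <= 0.0) continue;       /* no diagonal at zero distance */
        for (i = 0; i < n; i++)
            if (x[i] <= x[j] && y[i] <= bmax * (x[i] / x[j]))
                count++;                 /* on or below the diagonal */
        den = count / (bmax * x[j] / 2.0);   /* points per triangle area */
        if (den > best_den) { best_den = den; best = j; }
    }
    if (best_density) *best_density = best_den;
    return best;
}
```

Sorting the distances first, as the listing does with fpt_heapsort, lets the inner loop stop at j instead of scanning every pair; the unsorted form is used here only to keep the example short.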



/* n is number of elements
   arrin is array of floats to be sorted
   indx is array of ints initially 0,..,n-1
*/
int fpt_heapsort( n, arrin, indx )
int n;
fpt *arrin;
int *indx;
{
int l, ir, indxt, i, j;
fpt q;

l = n/2;
ir = n - 1;

while (TRUE) /* the "10" loop */
{
    if (l>0) { indxt = indx[--l]; q = arrin[indxt]; }
    else
    {
        indxt = indx[ir]; q = arrin[indxt];
        indx[ir--] = indx[0];
        if ( ir == 0 )
        { indx[0] = indxt; return 1; } /* <=== Only way out! */
    }
    i = l;
    j = l+l+1;
    while (j <= ir) /* the "20" loop */
    {
        if ( (j<ir) && (arrin[indx[j]] < arrin[indx[j+1]]) ) j++;
        if ( q < arrin[indx[j]] ) { indx[i] = indx[j]; i = j; j = j+j+1; }
        else { j = ir+1; }
    }
    indx[i] = indxt;
}
}

/* SECTION III-A. Declarations for all non-standard data structures referenced
   in the C code functions shown in Sections I and II. */




/**********************************************************************/
/*                                                                    */
/* Molecule and Supporting Structure Definitions                      */
/*                                                                    */
/* John McAlister 09-Aug-1985                                         */
/*                                                                    */
/* This file contains the definitions for the molecular data          */
/* structures required within SYBYL. The contents of this file are    */
/* described in detail in the document "SYBYL Molecular Data          */
/* Structures".                                                       */
/*                                                                    */
/**********************************************************************/

/* Define the molecule descriptor template */
typedef struct molecule_struct {
    char     *name;        /* pointer to molecule name */
    I32      type;         /* molecule type */
    List_Ptr dict;         /* list of dictionaries used with molecule */
    I32      status;       /* molecule status */
    char     *comment;     /* pointer to comment for molecule */
    stamp    cre_time;     /* creation time/user/version stamp */
    stamp    mod_time;     /* modification time/user/version stamp */
    int      max_props;    /* maximum properties currently allocated */
    int      nprops;       /* number of molecular properties */
    List_Ptr props;        /* pointer to list of properties */
    int      max_feats;    /* maximum features currently allocated */
    int      nfeats;       /* number of molecular features */
    List_Ptr feats;        /* pointer to list of molecular features */
    int      max_subst;    /* maximum substructures currently allocated */
    int      nsubst;       /* number of substructures in molecule */
    List_Ptr subst;        /* pointer to list of substructures */
    List_Ptr subst_roots;  /* pointer to list of root subst offsets */
    int      max_atoms;    /* maximum atoms currently allocated */
    int      natoms;       /* number of atoms in molecule */
    List_Ptr atoms;        /* pointer to atom array segment list */
    int      max_bonds;    /* maximum bonds currently allocated */
    int      nbonds;       /* number of bonds in molecule */
    List_Ptr bonds;        /* pointer to bond array segment list */
    int      charges;      /* type of atomic charges, if present */
    fpt      vector[3];    /* translation vector for molecule */
    fpt      matrix[9];    /* rotation matrix for molecule */
    List_Ptr assoc_data;   /* pointer to list of associated data descriptors */
} molecule, *mol_ptr;

/************************* ATOM DEFINITION ****************************/

/* Define the atom entry record template */
typedef struct atom_struct {
    char     *name;      /* atom name */
    int      type;       /* atom type */
    I32      status;     /* atom status */
    int      recno;      /* cumulative atom record number */
    int      id;         /* atom id (logical atom number) */
    int      link;       /* link to next atom record */
    int      subst;      /* offset to substructure containing atom */
    List_Ptr property;   /* pointer to list of properties for atom */
    List_Ptr feature;    /* pointer to list of features including */
                         /* this atom */
    int      nbond;      /* number of bonds involving this atom */
    List_Ptr conn_atom;  /* pointer to list of bonded atoms */
    fpt      xyz[3];     /* coordinates of atom */
    fpt      charge;     /* point charge on atom */
} atom, *atom_ptr;

/* Define the atom array segment descriptor template */
typedef struct atom_seg_struct {
    atom_ptr seg_head;   /* pointer to head of atom array segment */
    mol_ptr  molecule;   /* pointer to molecule containing atom seg */
    int      max_atom;   /* maximum number of atom records in seg */
    int      natom;      /* number of filled atom records in seg */
    int      used_atom;  /* offset to first filled record in segment */
    int      free_atom;  /* offset to first free record in segment */
} atom_seg, *aseg_ptr;

/* Define the bond specifier records pointed to by the atom records */
typedef struct atom_conn_struct {
    int target;    /* offset to target atom */
    int bond_rec;  /* offset to bond descriptor record */
} atom_conn, *acon_ptr;

/************************* BOND DEFINITION ****************************/

/* Define the bond entry record template */
typedef struct bond_struct {
    int      type;      /* bond type */
    I32      status;    /* bond status */
    int      recno;     /* cumulative bond record number */
    int      id;        /* bond id (logical bond number) */
    int      link;      /* link to empty bond record */
    List_Ptr property;  /* pointer to bond property list */
    List_Ptr feature;   /* pointer to list of features including */
                        /* this bond */
    int      o_subst;   /* offset to origin atom substructure */
    int      origin;    /* offset to atom at bond origin */
    int      t_subst;   /* offset to target atom substructure */
    int      target;    /* offset to atom at bond destination */
} bond, *bond_ptr;







/* Define the bond array segment descriptor template */
typedef struct bond_seg_struct {
    bond_ptr seg_head;   /* pointer to head of bond array segment */
    mol_ptr  molecule;   /* pointer to molecule containing bond seg */
    int      max_bond;   /* maximum number of bonds in segment */
    int      nbond;      /* number of filled bond records in seg */
    int      used_bond;  /* offset to first filled record in segment */
    int      free_bond;  /* offset to first free record in segment */
} bond_seg, *bseg_ptr;

/**********************************************************************/



/* ===== comfa.h ====== */
/* Regions are the set of points at which energy evaluations are made */
/* in the CoMFA method of QSAR. A region is defined as the union */
/* of a set of 3D boxes (which may be a single point in the */
/* limit) and their associated attributes. Attributes needed for */
/* CoMFA purposes are outlined below. */
/* */

#ifndef QSAR_COMFA_DEFINITIONS
#define QSAR_COMFA_DEFINITIONS 1

#include "ta_types.h"

#define DUMMY 26 /* dummy atom id */
#define LP 20 /* lone pair atom id */



typedef enum {
    FDENGY_UNKNOWN,
    FDENGY_ELECT,
    FDENGY_STERIC,
    FDENGY_HOMO,
    FDENGY_LUMO,
    DOCK_ELECT,
    DOCK_STA_NOHB,
    DOCK_STA_HBD,
    DOCK_STA_HBA,
    DOCK_STB_NOHB,
    DOCK_STB_HBD,
    DOCK_STB_HBA } FldEngyTyp;

typedef enum {
    FDHD_ORIGINAL,
    FDHD_FFIT,
    FDHD_XTERN,
    FDHD_FUNC,
    FDHD_USER,
    FDHD_USR_AVG,
    FDHD_DOCK,
    FDHD_AVG,
    FDHD_SIG,
    FDHD_MAX,
    FDHD_MIN,
    FDHD_COEFF,
    FDHD_AVG_X,
    FDHD_SIG_X,
    FDHD_FLD_X,
    FDHD_RANGE,
    FDHD_PLS_XWT,
    FDHD_PLS_XLOAD,
    FDHD_FAC_LOAD,
    FDHD_FAC_COMM,
    FDHD_FAC_ROTLOAD,
    FDHD_SIMCA_LOAD,
    FDHD_SIMCA_MODEL,
    FDHD_SIMCA_DISCRIM,
    FDHD_HBD } FldHowTyp;

typedef struct {
    fpt lo[3],        /* corner with lowest values for each axis */
        hi[3],        /* corner with highest values for each axis */
        stepsize[3];  /* increment between points */
    int nstep[3],     /* derived as 1 + (hi-lo + epsilon) / stepsize */
        n;            /* n = product of nstep[i] */
    int atom_type;    /* SYBYL atom type, for steric energy computation */
    fpt pt_charge;    /* elemental charge at point, for electrostatics */
    fpt *weight;      /* weight[n] is applied in all computations, e.g. =1 */
    int avg_type;     /* box of 'scale', sphere, sphere x vdw, */
    fpt avg_scale;    /* scale whose meaning derived from avg_type */
    int arb,          /* arbitrary int for later use */
        *parb;        /* arbitrary pointer for later use */
} Box, *BoxPtr;
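
The Box comments state that nstep is derived as 1 + (hi-lo + epsilon) / stepsize and that n is the product of the three nstep values. A small helper makes the relationship concrete. It is a sketch only: box_point_count is a hypothetical name, plain double arrays stand in for the fpt members, and the value of epsilon is an assumption because the listing does not give one.

```c
#include <assert.h>

/* Hypothetical helper: fills nstep[] for one box and returns the total
   number of lattice points in it (the Box member n). */
int box_point_count(const double lo[3], const double hi[3],
                    const double stepsize[3], int nstep[3])
{
    const double epsilon = 1.0e-6;   /* assumed tolerance, not from the listing */
    int i, n = 1;

    for (i = 0; i < 3; i++) {
        assert(stepsize[i] > 0.0 && hi[i] >= lo[i]);
        nstep[i] = 1 + (int) ((hi[i] - lo[i] + epsilon) / stepsize[i]);
        n *= nstep[i];
    }
    return n;
}
```

The epsilon term guards against a floating-point quotient such as 1.9999999 being truncated to 1 and silently dropping the last lattice plane. A box with hi equal to lo on every axis reduces to a single point, the limiting case mentioned in the comfa.h header comment.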



typedef struct {
    char   *filename;   /* name of the region's file (if any) */
    int    n_boxes;     /* number of boxes which make up the region */
    int    n_points;    /* number of points in this region altogether */
    BoxPtr box_array;   /* box_array[n_regions], each one a Box */
    int    n_refs;      /* number of CURRENT references to this memory */
    long   when_made;   /* creation stamp */
} Region, *RegionPtr;



typedef struct {
    char       *reg_name;     /* name of the region's file (if any) */
    char       *fld_name;     /* name of this field's file (if any) */
    RegionPtr  reference;     /* the region referenced by this field */
    FldEngyTyp fid;           /* what type of field is referenced here */
    int        num_avgd;      /* number of fields averaged into this one */
    int        curr_iter;     /* number of iterations in current field fit run */
    char       *mol_id;       /* unspecified molecule id, */
                              /* e.g. dbname/molname/alignname */
    int        n_points;      /* number of points in associated region */
    int        zap_el;        /* whether electrostatics are MISSING when>max_st */
    fpt        max_value;     /* largest permitted absolute value of energy */
    fpt        *field_value;  /* values at each point of the field */
    int        n_refs;        /* number of CURRENT references to this memory */
    long       when_made;     /* creation stamp */
    /* added these 4 items 1/30/89 DEP */
    int        vol_avg_type;
    fpt        scale_vol_avg;
    int        dielectric;
    int        repulsive;
    FldHowTyp  how_made;      /* perry's way = 1 or old way = 0 */
} Field, *FieldPtr;

/* molecule dependent information solicited by QSAR table operations,
   passed into COMFA column field evaluations */

typedef struct {
    boolean  already_field;  /* whether a field name exists (otherwise alignment) */
    char     *some_name;     /* name of alignment; Nil align==use as is */
    char     *steric_name;   /* name of steric field (if applicable) */
    char     *elect_name;    /* name of electrostatic field (if applicable) */
    FieldPtr sfld_p;         /* points to steric field in memory (when there) */
    FieldPtr efld_p;         /* points to elect. field in memory (when there) */
} ComfaMol, *ComfaMolPtr;



/* molecule-independent information for CoMFA evaluations */

typedef struct {
    int       vol_avg;       /* case for volume averaging: 0,1,2=none,box,sphere (0) */
    fpt       vol_scale;     /* scale for volume averaging (1.0) */
    int       fld_types;     /* case for what fields: 0,1,2=both,steric,elect (0) */
    fpt       steric_max;    /* maximum steric energy (30) */
    int       repulsive;     /* steric repulsive exponent - 12, 10, or 8 (12) */
    fpt       elect_max;     /* maximum electrostatic energy (30) */
    int       dielectric;    /* case for dielectric (AS FORCE FIELD TAILOR) */
    int       elect_out;     /* case to drop elect inside steric max: 0,1=T,F (1) */
    char      *region_name;  /* name of region used in the CoMFA computations */
    FieldPtr  sweight_fld;   /* points to MEMORY field for weighting steric PLS */
    FieldPtr  eweight_fld;   /* points to MEMORY field for weighting elect. PLS */
    FldHowTyp how_done;      /* perry's way = 1 or old way = 0 */
    int       du_lp_steric;  /* include dummies and lone pairs in steric field
                                calculations */
    int       du_lp_elect;   /* include dummies and lone pairs in electrostatic
                                field calculations */
    int       spare1;        /* As of 6.1 comfa, this is TAILOR!COMFA!TRANSFORM */
    int       spare2;        /* INDICATOR SCALE among other things */
} ComfaTop, *ComfaTopPtr;

#endif

Section III-B. Functional descriptions of external procedures.
(Routines that simply return dynamic memory to the heap are not
described.)

BOND_V_ERING - TRUE if bond is in an external ring. 

BOND_V_IRING - TRUE if bond is in an internal (simple) ring. 

QSAR_FIELD_EVAL_GETOFF - provides coordinates for field 
computation when "volume averaging" is being done. 

QSAR_FIELD_VDWTAB - returns steric parameters for the 
computation of the field contribution from the probe atom and each 
of the molecule atoms. 

SYB_AREA_GET_MOLECULE - returns the internal representation of 
the molecule in some area or "container", if such exists. 

SYB_ATAB_ATOMIC_NUMBER - returns the atomic number of the 
specified atom type. 

SYB_ATAB_ATOMIC_WEIGHT - returns the atomic weight of the 
specified atom type. 

SYB_ATAB_HBOND_ACCEPT - returns TRUE if the specified atomic 
type is a hydrogen-bond accepting atom. 

SYB_ATAB_VDW_RADII - returns the atomic radius of the specified 
atomic type. 

SYB_ATOM_FIND_ID - returns the internal representation of an atom 
referenced by its atom ID number. (Atom IDs are guaranteed to be 
continuous but the ID of any single atom may change as atoms are 
added or deleted.) 

SYB_ATOM_FIND_REC - returns the internal representation of an 
atom referenced by its record ID number. (Atom record IDs are 
invariant but there may be "holes" in their sequence such that the 
largest record ID may be greater than the number of atoms.) 

SYB_ATOM_FIND_SET - returns the bitset of atoms corresponding to 
a list of atoms. 

SYB_BOND_FIND_REC - returns the internal representation of a bond 
referenced by its (invariant) record ID number. 

SYB_BTAB_MNEM_TO_TYPE - converts an ASCII representation of a 
bond type to its internal representation. 

SYB_EXPR_ANALYZE - parses a user-entered ASCII description of 
atoms (e.g., M2(<H>) for all hydrogen atoms within molecule M2) 
into internally valid representations of molecule and atoms. 

SYB_HBOND_DONORS - returns the set of IDs for atoms which are 
hydrogen-bonding hydrogens. 

TAILOR_STORE_IT_HERE - returns the current value of a user- (and 
SPL-) accessible variable. 

TBL_ACCESS_INDEX_TO_COLNAME - converts a user-provided MSS 
column ID to a column name (name is guaranteed to be a unique 
identifier). 

TBL_GRAB_COMPLETE_FPTS - finishes returning multiple (scalar) values 
in an MSS column to an array. 

TBL_GRAB_GET_FPTS_INV - in a multiple value retrieval, returns the 
value corresponding to a user-provided row ID. 

TBL_GRAB_INIT_FPTS - sets up for returning multiple (scalar) values 
in an MSS column to an array. 

UBS_OUTPUT_MESSAGE - equivalent to fprintf(). 

UIMS2_VAR_GET_TOKEN - returns the current value of a global SPL 
variable. 

UIMS2_WRITE_ERROR - writes text to the error output stream. 

UTL_FILE_FCLOSE, UTL_FILE_FOPEN - equivalent to fclose() and 
fopen(). 

UTL_LIST_RETRIEVE - returns the next element on a linked list. 

UTL_MEM_ALLOC - equivalent to malloc(). 

UTL_SET_AND_INPLACE - makes the first set logically equivalent to 
the second set, with only those bits that are also 1 in the third set 
becoming 1 in the first set. 

UTL_SET_CARDINALITY - returns the number of bits that are 1 in a 
particular bitset. 

UTL_SET_CLEAR - sets all bits in the set to 0. 

UTL_SET_COPY_INPLACE - makes the first set logically identical to 
the second. 

UTL_SET_CREATE - creates and returns an empty set of requested 
size. 

UTL_SET_DELETE - sets the specified bit to 0. 

UTL_SET_DIFF_INPLACE - makes the first set logically equivalent to 
the second set, with all bits that are 1 in the third set becoming 0 in 
the first set. 

UTL_SET_EMPTY - TRUE if all bits in the set are 0. 

UTL_SET_INSERT - sets the requested bit to 1. 

UTL_SET_MEMBER - returns TRUE if the requested set bit equals 1. 

UTL_SET_NEXT - returns the identity of the next non-zero bit in a 
set. 

UTL_SET_OR_INPLACE - makes the first set logically equivalent to 
the second set, with all bits that are 1 in the third set becoming 1 in 
the first set. 

UTL_STR_CMP_NOCASE - non-case sensitive version of strcmp(). 
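
The set routines described above can be illustrated with a minimal sketch in portable C. Everything here is an assumption made for illustration: the `BitSet` type, the lowercase function names, and the `NO_MORE` sentinel are hypothetical stand-ins and do not reproduce the actual SYBYL `set_ptr` layout or the `UTL_SET_*` signatures. The `tanimoto` helper shows one common way to compare two bitsets of equal length, as mentioned in the introduction to Appendix A.

```c
#include <assert.h>
#include <stdlib.h>

#define NO_MORE (-1)   /* hypothetical sentinel, in the role of NO_MORE_ELEM */

/* Hypothetical bitset; NOT the real SYBYL set_ptr layout. */
typedef struct {
    int nbits;             /* number of addressable bits */
    unsigned char *bits;   /* packed storage, 8 bits per byte */
} BitSet;

static size_t nbytes(int nbits) { return ((size_t)nbits + 7u) / 8u; }

/* Like UTL_SET_CREATE: returns an empty set of the requested size. */
static BitSet *set_create(int nbits)
{
    BitSet *s;
    assert(nbits > 0);
    s = malloc(sizeof *s);
    if (!s) return NULL;
    s->nbits = nbits;
    s->bits = calloc(nbytes(nbits), 1);
    if (!s->bits) { free(s); return NULL; }
    return s;
}

static void set_destroy(BitSet *s) { if (s) { free(s->bits); free(s); } }

/* Like UTL_SET_INSERT / UTL_SET_DELETE / UTL_SET_MEMBER. */
static void set_insert(BitSet *s, int i)
{
    assert(i >= 0 && i < s->nbits);
    s->bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

static void set_delete(BitSet *s, int i)
{
    assert(i >= 0 && i < s->nbits);
    s->bits[i / 8] &= (unsigned char)~(1u << (i % 8));
}

static int set_member(const BitSet *s, int i)
{
    assert(i >= 0 && i < s->nbits);
    return (s->bits[i / 8] >> (i % 8)) & 1;
}

/* Like UTL_SET_NEXT: next 1 bit after prev (start with prev = -1). */
static int set_next(const BitSet *s, int prev)
{
    int i;
    for (i = prev + 1; i < s->nbits; i++)
        if (set_member(s, i)) return i;
    return NO_MORE;
}

/* Like UTL_SET_CARDINALITY: number of bits that are 1. */
static int set_cardinality(const BitSet *s)
{
    int i, n = 0;
    for (i = 0; i < s->nbits; i++) n += set_member(s, i);
    return n;
}

/* dst = a AND b, dst = a OR b, dst = a AND NOT b (sets of equal length). */
static void set_and_inplace(BitSet *dst, const BitSet *a, const BitSet *b)
{
    size_t k, n = nbytes(dst->nbits);
    assert(dst->nbits == a->nbits && a->nbits == b->nbits);
    for (k = 0; k < n; k++) dst->bits[k] = a->bits[k] & b->bits[k];
}

static void set_or_inplace(BitSet *dst, const BitSet *a, const BitSet *b)
{
    size_t k, n = nbytes(dst->nbits);
    assert(dst->nbits == a->nbits && a->nbits == b->nbits);
    for (k = 0; k < n; k++) dst->bits[k] = a->bits[k] | b->bits[k];
}

static void set_diff_inplace(BitSet *dst, const BitSet *a, const BitSet *b)
{
    size_t k, n = nbytes(dst->nbits);
    assert(dst->nbits == a->nbits && a->nbits == b->nbits);
    for (k = 0; k < n; k++)
        dst->bits[k] = (unsigned char)(a->bits[k] & ~b->bits[k]);
}

/* Tanimoto coefficient |a AND b| / |a OR b|; two empty sets are treated
   as identical (1.0), which is a convention chosen for this sketch. */
static double tanimoto(const BitSet *a, const BitSet *b)
{
    int i, both = 0, either = 0;
    assert(a->nbits == b->nbits);
    for (i = 0; i < a->nbits; i++) {
        int x = set_member(a, i), y = set_member(b, i);
        both += x & y;
        either += x | y;
    }
    return either ? (double)both / (double)either : 1.0;
}
```

A caller would typically build two sets with `set_create` and `set_insert`, combine them with one of the in-place operations, walk the result with `set_next` starting from -1, and release each set with `set_destroy`.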
APPENDIX "B" 



/* CODE. This code implements a PHORE_LOC column type and calculates a single 
cell value (the Hydrogen Bonding Fingerprint for a molecule) within the SYBYL 
Molecular Spreadsheet. It is to be understood that other supporting code handles 
user input, user output, and disk file I/O. */ 

/* data structure for PHORE_LOC column type */ 
typedef 
struct PHORE { 
    char *disco_fn;    /* user name for DISCO feature file - default 
                          appears below */ 
    int disco_in;      /* internal flag if DISCO feature file loaded */ 
    char *region_fn;   /* user name for defining region file */ 
    RegionPtr rgn;     /* internal reference to region when loaded */ 
    int nfuzz;         /* number of extra lattice points (each direction) 
                          for each PHORE feature */ 
    int nbits;         /* set length (must agree with rgn contents or EVAL 
                          fails) */ 
} PHORE, *PPHORE; 

/*+ qsar_proc_eval_phore_loc */ 
/*******************************************************************/ 
/* int QSAR_PROC_EVAL_PHORE_LOC(tablename, row, colname)           */ 
/*                                                                 */ 
/* Dick Cramer 31-Jul-95 (PHORE_LOC lattice bitset)                */ 
/*                                                                 */ 
/* This module generates bitsets whose cardinality is equal to     */ 
/* # lattice points x 2 (# of sitepoint classes). For each         */ 
/* instance of a pharmacophoric point in the molecule being        */ 
/* processed, the geometrically nearest (1+m)^3 bits in the        */ 
/* bitset will be set to 1 (where m is user supplied).             */ 
/*                                                                 */ 
/* NOTE: this routine explicitly requires that sets begin after a  */ 
/* first element that is the set size!!!                           */ 
/*                                                                 */ 
/* Inputs                                                          */ 
/*                                                                 */ 
/* Outputs                                                         */ 
/*                                                                 */ 
/* User Required Definition Files                                  */ 
/*                                                                 */ 

int QSAR_PROC_EVAL_PHORE_LOC( tablename, row, colname ) 
char *tablename, *colname; 
int row; 
{ 
    mol_ptr mol; 
    PPHORE phr; 
    int err, status, nvalid, mol_area; 
    char *cdum; 
    set_ptr print, qsar_proc_calc_phore_set(); 
    FILE *fp; 

    /* get the molecule */ 
    if ( !TBL_UTL_GET_MOLECULE(tablename, row, FALSE, &mol) ) 
    { 
        if ( UTL_ERROR_IS_SET() ) {err=1; goto error;} 
        else return FALSE; 
    } 

    /* get the user-provided input data */ 
    if ( !TBL_ATTR_FIND_COLUMN_A(tablename, colname, "PROC_SUPPORT", &cdum, 
                                 (int *)&phr) ) {err=3; goto error;} 

    /* retrieve DISCO stuff if not yet present */ 
    if ( !phr->disco_in ) { 
        if ( !phr->disco_fn ) {err=1; goto error;} 
        /* set appropriate tailor value, then initialize DISCO */ 
        sprintf( str, "SETVAR TAILOR!DISCO!FILE %s", phr->disco_fn ); 
        UIMS2_EXEC_COMMAND( str ); 
        UIMS2_EXEC_COMMAND( "DISCO INIT" ); 
        phr->disco_in = TRUE; 
    } 

    /* retrieve region if not yet present */ 
    if ( !phr->rgn ) { 
        if ( !phr->region_fn ) {err=1; goto error;} 
        if ( !(phr->rgn = QSAR_REGION_RETRIEVE( phr->region_fn )) ) 
            {err=4; goto error;} 
        if ( phr->rgn->n_boxes > 1 ) { 
            sprintf( str, "WARNING: Region %s has %d boxes. Only first will be used.\n", 
                     phr->region_fn, phr->rgn->n_boxes ); 
            UBS_OUTPUT_MESSAGE( stdout, str ); 
        } 
        phr->nbits = 2 * phr->rgn->n_points; 
    } 

    /* evaluate this result, first the DISCO call */ 
    if ( !(print = qsar_proc_calc_phore_set( mol, phr, &nvalid )) ) {err=12; 
        goto error;} 

    /* go store both the bitset in the MSS "Cell_Support" and the number of bits 
       actually set in the "CELL", so there's something for the user to see */ 
    if ( !TBL_ACCESS_X_PUT_VALUE(tablename, row, colname, "CELL_SUPPORT", 
                                 (int *)&print) ) {err=11; goto error;} 
    if ( !TBL_ACCESS_X_PUT_VALUE(tablename, row, colname, "CELL", 
                                 (int *)&nvalid) ) {err=11; goto error;} 

    return TRUE; 

error: 
    sprintf( str, "QSAR_PROC_EVAL_PHORE_LOC (%d)", err ); 
    UTL_ERROR_ADD_TRACE( str ); 
    return FALSE; 
} 


set_ptr qsar_proc_calc_phore_set( mol, phr, nvalid ) 
/* creates actual bitset */ 
mol_ptr mol; 
PPHORE phr; 
int *nvalid; 
{ 
    set_ptr anset = NIL, pset = NIL, SYB_FEAT_FIND_ID_SET(); 
    feat_ptr featp, SYB_FEAT_FIND_REC(); 
    atom_ptr a, SYB_ATOM_FIND_REC(); 
    int err, elem, sitebase, ci, xybase, boff, lt_base[3], lt_off[3], loff = 0, 
        hioff = 0; 
    fpt tmp; 
    BoxPtr bxptr; 
    line_ptr cdp; 

    if ( !(anset = UTL_SET_CREATE( phr->nbits )) ) {err = 1; goto error;} 
    *nvalid = 0; 
    if ( phr->nfuzz ) { 
        loff -= phr->nfuzz / 2; 
        hioff += (phr->nfuzz + 1) / 2; 
    } 
    bxptr = phr->rgn->box_array; 
    xybase = bxptr->nstep[0] * bxptr->nstep[1]; 

    /* generate the DISCO sites for this molecule, which */ 
    UIMS2_EXEC_COMMAND( "ECHO %DISCO_SITES()" ); 
    /* become "FEATURES" + "dummy atoms" within SYBYL's molecule data 
       structure */ 

    pset = SYB_FEAT_FIND_ID_SET( mol, FEAT_V_LINE, 1, mol->nfeats ); 
    if ( pset ) { 
        elem = -1; 
        while ( (elem = UTL_SET_NEXT( pset, elem )) != NO_MORE_ELEM ) { 
            if ( !(featp = SYB_FEAT_FIND_REC( mol, elem )) ) goto error; 
            if ( (featp->name[1] == 'S') && (featp->name[2] == '_') ) { 
                /* have an H-bonding feature, it must represent a line */ 
                sitebase = featp->name[0] == 'A' ? 0 : phr->rgn->n_points; 
                /* the dummy atom at the end of the line is our H-bonding locus */ 
                cdp = (line_ptr) featp->dataptr; 
                if ( !(a = SYB_ATOM_FIND_REC( mol, cdp->positn )) ) {err=2; goto 
                    error;} 
                for ( ci = 0; ci < 3; ci++ ) { 
                    tmp = (a->xyz[ci] - bxptr->lo[ci]) / bxptr->stepsize[ci]; 
                    lt_base[ci] = (int) (tmp < 0.0 ? tmp - bxptr->stepsize[ci] : 
                                         tmp); 
                } 
                /* cycle through all points touched by this locus that are also 
                   within the region */ 
                for ( lt_off[0] = lt_base[0] + loff; lt_off[0] <= lt_base[0] + hioff; 
                      lt_off[0]++ ) 
                  if ( lt_off[0] >= 0 && lt_off[0] < bxptr->nstep[0] ) 
                    for ( lt_off[1] = lt_base[1] + loff; lt_off[1] <= lt_base[1] + 
                          hioff; lt_off[1]++ ) 
                      if ( lt_off[1] >= 0 && lt_off[1] < bxptr->nstep[1] ) 
                        for ( lt_off[2] = lt_base[2] + loff; lt_off[2] <= lt_base[2] + 
                              hioff; lt_off[2]++ ) 
                          if ( lt_off[2] >= 0 && lt_off[2] < bxptr->nstep[2] ) { 
                              boff = xybase * lt_off[2] + 
                                     (bxptr->nstep[0]) * lt_off[1] + 
                                     lt_off[0] + sitebase; 
                              UTL_SET_INSERT( anset, boff ); 
                              (*nvalid)++; 
                          } 
            } 
        } 
        UTL_SET_DESTROY( pset ); 
    } /* pset exists */ 

    return( anset ); 

error: 
    sprintf( str, "qsar_proc_calc_phore_set (%d)", err ); 
    UTL_ERROR_ADD_TRACE( str ); 
    return FALSE; 
} 
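
The lattice indexing used by qsar_proc_calc_phore_set can be shown in isolation with a small sketch. The `Box` type and the function names below are hypothetical and are not part of SYBYL; only the offset formula and the loff/hioff window follow the listing above. One deliberate difference, stated as an assumption: the base lattice index is computed here with a plain floor, whereas the listing adjusts negative offsets by the step size.

```c
#include <assert.h>

/* Hypothetical stand-in for one box of a region. */
typedef struct {
    double lo[3];        /* lower corner of the box */
    double stepsize[3];  /* lattice spacing along x, y, z */
    int nstep[3];        /* number of lattice points along x, y, z */
} Box;

/* Bit offset of lattice point idx within the class starting at sitebase,
   or -1 when the point lies outside the box. Same formula as boff above:
   xybase * z + nstep[0] * y + x + sitebase. */
static int lattice_bit(const Box *bx, const int idx[3], int sitebase)
{
    int c;
    for (c = 0; c < 3; c++)
        if (idx[c] < 0 || idx[c] >= bx->nstep[c]) return -1;
    return bx->nstep[0] * bx->nstep[1] * idx[2]
         + bx->nstep[0] * idx[1]
         + idx[0] + sitebase;
}

/* Window of lattice offsets for a fuzz value m: it spans 1+m points per
   axis, hence (1+m)^3 points in total when all lie inside the box. */
static void fuzz_window(int nfuzz, int *loff, int *hioff)
{
    *loff = -(nfuzz / 2);
    *hioff = (nfuzz + 1) / 2;
}

/* Floor of t as an int (assumption of this sketch, see lead-in). */
static int floor_index(double t)
{
    int i = (int)t;
    return (t < (double)i) ? i - 1 : i;
}

/* Number of in-box lattice points touched by a site located at xyz. */
static int count_touched(const Box *bx, const double xyz[3], int nfuzz)
{
    int base[3], idx[3], loff, hioff, c, n = 0;
    fuzz_window(nfuzz, &loff, &hioff);
    for (c = 0; c < 3; c++)
        base[c] = floor_index((xyz[c] - bx->lo[c]) / bx->stepsize[c]);
    for (idx[2] = base[2] + loff; idx[2] <= base[2] + hioff; idx[2]++)
        for (idx[1] = base[1] + loff; idx[1] <= base[1] + hioff; idx[1]++)
            for (idx[0] = base[0] + loff; idx[0] <= base[0] + hioff; idx[0]++)
                if (lattice_bit(bx, idx, 0) >= 0) n++;
    return n;
}
```

With a 4 x 3 x 2 box, for instance, the second sitepoint class would start at sitebase 24, so the two classes occupy disjoint halves of a 48-bit set, matching nbits = 2 * n_points in the listing.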



# This file determines the recognition of site points in Sybyl/DISCO. 
# See the SYBYL DISCO manual for detailed documentation. The defined types are 
# 
# (1) HB : the QUERY is searched in the SEARCH mode, and all occurrences 
#          are assigned DISCO features according to the remaining 
#          specifications -- the three ATOMS refer to the atom number 
#          in QUERY such that the feature is DIST from the first atom 
#          at bond ANGLE with the first and second atom at each of the 
#          TORSIONS formed by the site point and the three ATOMS in order. 
#          A sitepoint of NAME is added at these extension points, 
#          -- and -- the first atom is assigned a feature complementary 
#          to the extension point (such as HBD_CO_ and RHBD_CO_). 
# (2) HBex: differs from HB in that the angles and torsions are replaced 
#          by two other arguments: whether lone pairs are part of the 
#          extension point placement, and which ATYPE (generally LP 
#          and/or H) determine the direction of the sitepoints. 



#TYPE NAME ATOMS SEARCH DIST ANGLE TORSIONS 



QUERY 



HE DS_02C2_ 

HE DS_03Car_ 

HE DS_03Car_ 

HE DS_03Car_ 

HE DS_03Car_ 

HE DS_03Car_ 

HE DS_03C3_ 1 
0[f]HC(Any) (Any) C (Any) (Any) Any 

HE DS_N3C3_ 14 5 NoDup 2.9 

HE DS 02S 3 2 1 All 2.9 120 



4 2 1 NoDup 2.9 120 "0.0 180.0" HevC(Any)=0[f ] 
13 4 All 2.9 119 "0.0 180.0" 0[f ]HC( :Hev) :Hev 

119 "0.0 180.0" 0[f ]C(:Hev) :Hev 
9 119 "0.0 180.0" 0[f]HC(=0) 
9 119 "0.0 180.0" 0[f]C(=0) 
120 "0.0 180.0" C(:0[f ]) :0[f ] 
117 "60 180 300" 



1 
1 
1 
2 
3 



3 All 2.9 

4 NoDup 2. 
3 NoDup 2. 
3 All 2.9 



6 NoDup 2.9 



#TYPE NAME ATOMS SEARCH DIST LP ATYPE Query 



110 "60 180 300" N[f 3H2ZC{Z:C&lC=0&!C:Hev} 
"0.0 180" AnyS(=0) (=0)NH 



HEeig DS_03C3_ 2 13 NoDup 2 . 9 YES "LP H" 
O [ f fHC (Any) (Any) Z { Z : Hev& I C (Any) (Any) Any} 
HBek4DS_03C3_ 3 12 NoDup 2.9 YES "LP" 
HBexl DS_N3C3_ 2 14 Nodup 2.9 "" "H" 
NCfp2YaZ{Z:Hev&lC}{Ya:C&!C=0&!C:Hev} 



0[f ] (Z) Z{Z:C&!C=Het} 



HBesM DS_N3C3_ 2 13 NoDup 2.9 
HBe0DS_N3C3_ 3 12 NoDup 2.9 
N[fitYa) (Ya) Ya{Ya:C&iC=0&!C:Hev} 



YES "LP H" N[f ]H(Ya)Ya{Ya:C&!C=0&iC:Hev} 
YES "LP" 



HBex DS_N2C2 
HBeCi DS_N2C2" 
HBexj DS_N2C2' 
HBeij DS_N2N2' 
HBeMDS_N2N2' 
HEexj DS_N2N2' 
hb uDS_03S_" 
hb ' DS_03S_ 
hb DS_03S_ 
hb DS_03N_ 
hb DS_02N_ 
hbex DS_N2N2 
hb DS_03P_' 
hb DS 03P 



2 13 NoDup 3.0 
12 3 NoDup 3.0 
12 3 NoDup 3.0 

2 13 NoTriv 3, 

2 13 NoTriv 3. 

3 2 1 NoDup 3. 

3 2 1 NoDup 2.9 

4 2 1 All 2.9 
4 2 1 All 2.9 

3 2 4 All 2.9 

4 2 1 NoDup 2.9 
3 2 1 NoDup 3, 
3 12 All 2.9 

3 12 All 2.9 



YES "H LP" N[f]H=C 
YES "H LP" Any~N[f 
YES "LP" Any~N[r 
0 YES "LP H" N[1]H: 
0 YES "LP H" N[13H: 
0 YES "LP" C:N[f] 

128 "0.0 180.0" 
128 "0.0 180.0" 
128 "0.0 180.0" 
128 "0.0 180.0" 

128 "0.0 180.0" 
0 YES "LP" 
128 "0.0 180.0" 
128 "0.0 180.0" 



# #CLASSNAMES# Acceptor_site Donor_Atom DL 
HB AS_H03C2_ 13 4 All 2.9 119 "0.0 18C 
HB AS_H03C3_ 13 6 NoDup 2.9 
0[f]HC(Any) (Any) C (Any) (Any) Any 
HB AS_N3C3_ 14 7 NoDup 2.9 
N[f]H2C{Any) (Any) C (Any) (Any) Any 
HB AS_N3C3_ 15 8 NoDup 2.9 
N[f]H3C(Any) (Any)C(Any) (Any) Any 
#TYPE NAME ATOMS SEARCH DIST LP 



]=C 
]=C[r] 

C:C:N[f ] :C:@1 
C:C:N[f ] :C:@1 
:Hev 

HevS=0[f ] 
HevS(=0[f ])=0[f] 
HevS(~0[f3) ('-0Cf])~0[f] 
HevN(0[f])0[f] 
HevN(Hev) ~0[f ] 
N:N[f ] :N 
P(~0) (~0) (~0) (-0) 
P(~0) (-0) (~0) 



119 "0.0 180.0" 0[f ]HC(:Hev) :Hev 
117 "60 180 300" 

110 "60 180 300" 

110 "60 180 300" 



ATYPE Query 
HBex AS_HN2C2_ 2 13 NoDup 3.0 

HBex AS_HN2C2_ 3 2 1 NoDup 3.0 

HBex AS_HN2C2_ 6 5 4 NoTriv 3.0 

HBex AS H03C3 2 13 NoDup 2.9 



ini iiHii NHC(Any)=0[f ] 

YES "LP H" C:N[f3H:Hev 
YES "LP" N[l]H:C:C:N[f ] :C:@1 

YES "LP H" 



O [ f 3 HC (Any) (Any ) Z { Z : HevS ! C (Any) (Any) Any} 

HBex AS_HN2C2_ 3 2 4 Nodup 3.0 YES "LP H" HevN[f]H=C 
HBex AS_HN2C2_ 12 3 Nodup 3.0 YES "LP" HevN[f]=C 
HBex AS_HN2C2_ 2 14 Nodup 3.0 "" "H" N[f]H2C(N)=N 
HBex AS_N3C3_ 2 14 Nodup 2.9 YES "LP H" 
N[f]H2C(Any) (Any) Z{Z:Hev&!C{Any) (Any) Any} 
HBex AS_N3C3_ 2 15 Nodup 2 . 9 YES "LP H" 
N[f]H3C(Any) (Any) Z{Z:Hev&!C(Any) (Any)Any} 



AS 

as" 
as" 
as" 
as" 
as" 



HBex 
HBex 
HBex 
HBex 
HBex 
HBex 
HBe|Ej AS 
HBeJh as' 
HBexl as" 
HBe3C= as' 
HBe?i hS\ 
hbef ; as" 
hb 2 AS 
hb S as' 



N3C3_ 
N3C3_ 
N3C3_ 
■N3C3_ 
■HN2C2_ 
HN2C2_ 
HN2C2_ 
HN2C2_ 
■hN2C2_ 
HNS3_ 
■HN4_ 2 
"HN2N2_ 
"03P_ 
"03P 



1 
1 
1 
1 
2 
3 
2 
2 
1 



H" N[f ]H(Ya) Ya{Ya;C&!C=0&!C:Hev} 
H" N[f ]H2 (Ya) Ya{Ya:C&!C=0&!C:Hev} 
H" N[f]H(Ya) (Ya)Ya{Ya:C&!C=0&!C:Hev} 



NoDup 2.9 YES "LP 
NoDup 2.9 YES "LP 
NoDup 2.9 YES "LP 

NoDup 2.9 YES "LP" N[f ] (Ya) (Ya) Ya{Ya:C&!C=0&lC:Hev} 
3.0 YES "H LP" N[f]H=C 
"LP" N[f]=C~Any 
"H" N [ f ] H2Hev ( : Hev) : Rev 
"H" N [ f ] HHev ( : Hev) : Hev 
" "H" HNC=Any 

"H" • AnyS(=0) (=0)N[f ]H 
"C*" N[f] (Z) (Z) (Z)Z{Z:C&lC=0&!C:Hev} 
YES "LP" N:N[f]:N 
"0.0 180.0" P(~0) (~0) (~0) (~0) 



3 NoDup 3.0 

2 NoDup 3.0 

4 NoDup 3.0 

3 NoDup 3.0 
3 NoDup 3.0 

6 5 2 NoDup 3.0 " 
1 3 NoDup -3.6 "" 
3 2 1 NoDup 3.0 
3 12 All 2.9 128 
3 12 All 2.9 128 



YES 

YES 
II II 



It It 



"0.0 180.0" 



P(-0) (~0) (~0) 
APPENDIX "C" 

EXPERIMENTAL DATA SETS 

     Data Set        No. of Cpds   Structure, Activity 

 1   Uehling              9        camptothecin, DNA fragmentation 
 2   Strupczewski        34        benzisoxazoles, ip Behavioral 
 3   Siddiqi             10        adenosines, Brain A1 binding 
 4   Garratt1            10        tryptamines, melanophore binding 
 5   Garratt2            14        tryptamines, melanophore binding 
 6   Heyl                11        deltorphin, opioid receptor (DAMGO) 
 7   Cristalli           32        adenosines, A2a agonists 
 8   Stevenson            5        piperidines, NK1 antagonism 
 9   Doherty              6        triarylbutenolides, endothelin-A antag. 
10   Penning             13        SC-41930 analogs, LTB4 antagonism 
11   Lewis                7        oxazolinediones, NK1 binding 
12   Krystek             30        sulfonamides, endothelin-A antagonism 
13   Yokoyama1           13        oxamic acids, T3 binding 
14   Yokoyama2           12        oxamic acids, T3 binding 
15   Svensson            13        benzindoles, 5-HT1A agonism 
16   Tsutsumi            13        peptidyl heterocycles, endopeptidase inhib. 
17   Chang               34        biphenyl sulfonamides, AT1 binding 
18   Rosowsky            10        trimetrexate analogs, DHFR inhibition 
19   Thompson             8        peptidomimetic, HIV-1 protease inhibition 
20   Depreux             26        naphthylethyl amides, melatonin displ. 



Literature References for Data Sets: 

1. Uehling, D.E., Nanthakamur, S.S., Croom, D., Emerson, D.L., Leitner, P.P., 
Luzzio, M.J., et al. Synthesis, Topoisomerase I Inhibitory Activity, and in Vivo 
Evaluation of 11-Azacamptothecin Analogs. J. Med. Chem. 1995, 38, 1106. (Table 2, 
with R2=Et; IC50 data.) 

2. Strupczewski, J.T., Bordeau, K.J., Chiang, Y., Glamkowski, E.J., Conway, P.G., et 
al. 3-[[(aryloxy)alkyl]piperidinyl]-1,2-Benzisoxazoles as D2/5-HT2 Antagonists with 
Potential Atypical Antipsychotic Activity: Antipsychotic Profile of Iloperidone 
(HP873). J. Med. Chem. 1995, 38, 1119. (Tables 2 and 3 with n=3, X=O; ED50 for 
inhibition of apomorphine-induced climbing.) 

3. Siddiqi, S.M., Jacobson, K.A., Esker, J.L., Olah, M.E., Ji, Xi.-duc, et al. Search 
for New Purine- and Ribose-Modified Adenosine Analogs as Selective Agonists and 
Antagonists at Adenosine Receptors. J. Med. Chem. 1995, 38, 1174. (Table 1, 
R2=H; Ki(A1), values estimated from % displacement and stereoisomers averaged as 
needed.) 

4. Garratt, P.J., Jones, R., Tocher, D.A., Sugden, D. Mapping the Melatonin 
Receptor. 3. Design and Synthesis of Melatonin Agonists and Antagonists Derived 
from 2-Phenyltryptamines. J. Med. Chem. 1995, 38, 1132. (Table 1 and Table 2.) 

5. Garratt, P.J., Jones, R., Tocher, D.A., Sugden, D. Mapping the Melatonin 
Receptor. 3. Design and Synthesis of Melatonin Agonists and Antagonists Derived 
from 2-Phenyltryptamines. J. Med. Chem. 1995, 38, 1132. (Table 1 and Table 2.) 

6. Heyl, D.L., Dandabuthla, M., Kurtz, K.R., Mousigian, C. Opioid Receptor Binding 
Requirements for the δ-Selective Peptide Deltorphin I: Phe3 Replacement with Ring- 
Substituted and Heterocyclic Amino Acids. J. Med. Chem. 1995, 38, 1242. (Table 1; 
binding Ki to DAMGO.) 

7. Cristalli, G., Camaioni, E., Vittori, S., Volpini, R., Borea, P.A., et al. 2-Aralkynyl 
and 2-Heteroalkynyl Derivatives of Adenosine-5'-N-ethyluronamide as Selective A2a 
Adenosine Receptor Agonists. J. Med. Chem. 1995, 38, 1462. 

8. Stevenson, G.I., MacLeod, A.M., Huscroft, I., Cascieri, M.A., Sadowski, S., 
Baker, R. 4,4-Disubstituted Piperidines: A New Class of NK1 Antagonist. J. Med. 
Chem. 1995, 38, 1264. (Table 1.) 

9. Doherty, A.M., Patt, W.C., Edmunds, J.J., Berryman, K.A., Reisdorph, B.R., et al. 
Discovery of a Novel Series of Orally Active Non-Peptide Endothelin-A (ETA) 
Receptor-Selective Antagonists. J. Med. Chem. 1995, 38, 1259. (Table 3; IC50 ETA.) 

10. Penning, T.D., Djuric, S.W., Miyashiro, J.M., Yu, S., Snyder, J.P., et al. Second- 
Generation Leukotriene B4 Receptor Antagonists Related to SC-41930: Heterocyclic 
Replacement of the Methyl Ketone Pharmacophore. J. Med. Chem. 1995, 38, 858. 
(Table 1, all; LTB4 receptor binding.) 

11. Lewis, R.T., MacLeod, A.M., Merchant, K.J., Kelleher, F., Sanderson, I., et al. 
Tryptophan-Derived NK1 Antagonists: Conformationally Constrained Heterocyclic 
Bioisosteres of the Ester Linkage. J. Med. Chem. 1995, 38, 923. 

12. Krystek, S.R., Hunt, J.T., Stein, P.D., Stouch, T.R. 3D-QSAR of Sulfonamide 
Endothelin Inhibitors. J. Med. Chem. 1995, 38, 659. 

13. Yokoyama, N., Walker, G.N., Main, A.J., Stanton, J.L., Morrissey, M., et al. 
Synthesis and SAR of Oxamic Acid and Acetic Acid Derivatives Related to L- 
Thyronine. J. Med. Chem. 1995, 38, 695. 

14. Yokoyama, N., Walker, G.N., Main, A.J., Stanton, J.L., Morrissey, M., et al. 
Synthesis and SAR of Oxamic Acid and Acetic Acid Derivatives Related to L- 
Thyronine. J. Med. Chem. 1995, 38, 695. 

15. Haadsma-Svensson, S.R., Svensson, K., Duncan, N., Smith, M.W., Lin, Ch.-H. C-9 
and N-Substituted Analogs of cis-(3aR)-(-)-2,3,3a,4,5,9b-Hexahydro-3-propyl-1H- 
benz[e]indole-9-carboxamide: 5HT1A Receptor Agonists with Various Degrees of 
Metabolic Stability. J. Med. Chem. 1995, 38, 725. 

16. Tsutsumi, S., Okonogi, T., Shibahara, S., Ohuchi, S., Hatsushiba, E., et al. 
Synthesis and Structure Activity Relationships of Peptidyl α-Keto Heterocycles as 
Novel Inhibitors of Prolyl Endopeptidase. J. Med. Chem. 1994, 37, 3492. (Table 2, 
X=CH2CH2; IC50.) 

17. Chang, L.L., Ashton, W.T., Flanagan, K.L., Chen, Ts.-Bau., O'Malley, S.S., et al. 
Triazolinone Biphenylsulfonamides as Angiotensin II Receptor Antagonists with High 
Affinity for Both the AT1 and AT2 Subtypes. J. Med. Chem. 1994, 37, 4464. (Table 
1, R^ =(2-Cl)C6H5; AT1 [rabbit aorta] IC50.) 

18. Rosowsky, A., Mota, C.E., Wright, J.E., Queener, S.F. 2,4-Diamino-5- 
chloroquinazoline Analogs of Trimetrexate and Piritrexim: Synthesis and Antifolate 
Activity. J. Med. Chem. 1994, 37, 4522. (Table 2; rat liver IC50.) 

19. Thompson, S.K., Murthy, K.H.M., Zhao, B., Winborne, E., Green, D.W., et al. 
Rational Design, Synthesis, and Crystallographic Analysis of a Hydroxyethylene- 
Based HIV-1 Protease Inhibitor Containing a Heterocyclic P1'-P2' Amide Bond 
Isostere. J. Med. Chem. 1994, 37, 3100. (Table 2, X=Boc; apparent Ki.) 

20. Depreux, P., Lesieur, D., Mansour, H.A., Morgan, P., et al. Synthesis and 
Structure-Activity Relationships of Novel Naphthalenic and Bioisosteric Related 
Amidic Derivatives as Melatonin Receptor Ligands. J. Med. Chem. 1994, 37, 3231. 
APPENDIX "D" 

A list of 736 commercially available thiols broken down into 231 clusters based on topomeric 
CoMFA field descriptors, along with the systematic name applicable to each. The 231 clusters 
are sorted by proposed name, first by the "root" structure, i.e., the fragment attached 
immediately to the -SH, and then by the substitution pattern on that "root" substructure. The 
names describe topologically equivalent hydrocarbons, i.e., structures in which all monovalent 
atoms are replaced by hydrogens and the other atoms by carbons. 
Cluster ID     No. of Cpds    Root    Structural Substitution^ 

(illegible)    (illegible)    aryl    Simple 
(illegible)    (illegible)    aryl    2,3,5-Me 
(illegible)    (illegible)    aryl    2,3,5-Me-4-Pr 
163            1              aryl    2,3-(4-(2,3-Pr)5het)5hetO 
151            1              aryl    2,3-(4-Bu)5hetO-5-Me 
33             5              aryl    2,3-Benzo 
80             2              aryl    2,5-Me 
192            1              aryl    2,5-Me-3-iPe 
7              14             aryl    2,6-NoH-3(4/5)-Me 
27             6              aryl    2,6-NoH-3-Ar 
107            2              aryl    2-(2-Bz)PheEt-4,5-Benzo 
189            1              aryl    2-(3,5-Me)Ar-4,5-Benzo 
141            1              aryl    2-(4-Et)PhePr 
205            1              aryl    2-(4-Stilbenyl)Stilbenyl 
188            1              aryl    2-5hetCH2-4,5-Benzo 
56             3              aryl    2-Ar 
138            1              aryl    2-Ar-3,5-Me 
190            1              aryl    2-Ar-4,5-(3,4-Et)Benzo 
41             6              aryl    2-Ar-4,5-Benzo 
152            1              aryl    2-Bz 
16             9              aryl    2-Et 
85             2              aryl    2-NoH-3-Et-5-Me 
106            2              aryl    2-PheEt-4,5-Benzo 
77             2              aryl    2-PhePr 
142            1              aryl    2-R8 
121            2              aryl    2-Stilbenyl 
97             2              aryl    3,4-(3-Me)Benzo 
218            1              aryl    3,4-(a,b)IndenO 
164            1              aryl    3,4-(a,b,(8-Ar)IndenO)-6-Me 
98             2              aryl    3,4-(a,b,(c-Me)IndenO) 
99             3              aryl    3,4-(a,b-Naphtho) 
157            1              aryl    3,4-Ar 
58             3              aryl    3,4-Benzo-5-Me 
100            2              aryl    3,4-Benzo-6-tBu 
37             5              aryl    3,5-Me 
180            1              aryl    3-(2,3-Benzo-4-Et)5het 
199            1              aryl    3-(2,3-Benzo-5-Me)5het 
(illegible)    1              aryl    3-(2-Me-3-5het-5-Et)5het 
115            2              aryl    3-(3-5het)5het 
193            1              aryl    3-(3-Ar)5het-4-Me 
67             3              aryl    3-Ar 
129            2              aryl    3-Ar-4-(2-Me)5hetCH2 
46             4              aryl    3-Ar-5-Me 
155            1              aryl    3-Bz 
82             2              aryl    3-Bz-5,6-Benzo 
10             16             aryl    3-Me 
70             3              aryl    3-Naphth 
73             3              aryl    (illegible) 
95             2              aryl    3-iPr 
88             2              aryl    4-Ar 
81             2              aryl    4-Bz 
48             4              aryl    4-Et 
2              23             aryl    4-Me 
92             2              aryl    4-R9+ 
90             4              aryl    4-iBu 
19             8              aryl    6-NoH 


148^ 


1 




\ ClV.itSiiVjoXllt2 } 


228 
^ £i \j 


1 


T**vr1 

CIX X 


\ X X U.WX crov^t:;Xli / 


12 


10 

X w 


IXC? L> 


o xxu^ X cr 


50 
1 

X 




9 ? -R"h 01-0-4 -Mo 
^ / ^ ^iis?u>v,y *± lie: 


89 


9 

£s 


lie ^ 


'9 ?-A-r 
^ / ^ r^x 




1 
X 






69 


■J 


./lit:? L. 


9-/9 -Mo^ A-r-*^ - f 9 -Mo^ PHol^i- 


198 


1 

X 




9- ^9-Mo^ A-r-'^-Pl n 
^ V iXlts / Ax ~ J — i\x U 


174. 


1 

X 






171 


1 

X 




9_ / "5 CT— Mo\ a-r — — RViial- 


17 0 
X / u 


i 

X 




z — V -5 / D— rie; oZ~j / 4t— iienzo 


X i£« ^ 




S*h /af- 
file? L. 






1 




9 — / 4 — T?f- \ A V 
Z — V f± ii L. ^ i\x 


^ u ^ 


1 
X 




9— /4 — TT-f-^ Ar' — A— ^A— Av 


199 
X ^ ^ 






Z— \ ^ — xJrx j /ir— J — I3Z 


1 Q7 

X-7 / 


1 
X 




Z— DlieUV^itZ J— L.DU;iir 


c 
o 


1 4 

X 




9 — A r- 
Z iix 


99 =i 
^ ^ .ij 


1 
X 




9 — A-r — — ^9 — Ar'\ crVttfil-'Rn 
Z fix— J — ^Z— H.X / OiieuJDU 


994 


1 

X 




9— A-r — ^9 — Ar*^ mno1-r'M9 
^ rlx J ^ Z rix / jiitr U\-.riZ 


o o 


-3 




9— Ar*— 7 — ^9— \ Ay* 
Z —iiX — ^ — V Z — OZ / i\x 


178 

X / o 


9 


^ lies ^ 


9_A-r — ^9-Mo\ m-toi- 


72 


3 


^11^ L« 




4.0 


.J 


Dll^ L. 


9— At~ — '3— /'3_A>*\ tr£=i-f- T^■H 
Z— Ax— J— Ax / DrieuiiiU 


1 8^ 
X o o 


1 

X 


jiieu 
64 


3 


«j lie u 


^ AX J \ J AX J Ficr / J lies U 


105            2              5het    2-Ar-3-(3-Me)Ar 
160            1              5het    2-Ar-3-(4-Ar)Cyhx 
145            1              5het    2-Ar-3-(4-Ar)CyhxCH2 
203            1              5het    2-Ar-3-(4-PheEt)Ar 
126            2              5het    2-Ar-3-(tBu)Ar 
17             9              5het    2-Ar-3-Ar 


211^ 


1 


5het 


9 — At* — — I^OTi "7^/1 1 HoTJo 

^ £^Am -J Udl^J< XXVACZllC? 


124            2              5het    2-Ar-3-IndenCH2 


7 fib 




^11^ u 


9 — A — —Mo 

Z Ax J ixie 




6 


^ lic= U 


Z — Ax J — rTieirr 


204            1              5het    2-Ar-5-(4-(2,4-Me)Bz)Ar 
79             2              5het    2-Bz 
78             2              5het    2-Bz-3,4-Benzo 
117            2              5het    2-Cyhx 
185            1              5het    2-Cyhx-3,4-iPe 
68             3              5het    2-Et 
112            2              5het    2-Et-3-(2-Me)PheEt 
128            2              5het    2-Me-3,4-(3-Me)Benzo 
93             2              5het    2-Me-3,4-Benzo 
61             3              5het    2-Me-3-(2,3,4-Me)5het 
181            1              5het    2-Me-3-(2,3-Benzo-4-Et)5het 
49             4              5het    2-Me-3-(3-Ar)5het 
86             2              5het    2-Me-3-(3-Ar)5hetPr 


91 


2 


5het 
4 


17 


Shet 


2-Me-3- (3-Bz)Ar 


172 


1 
38 


5 
13 


10 
1 


Shet 
29 
71 
54 
96 
94 
1 


Shet 


3,4- (a,b-Napththo) 


3 6 


15 


Shet 


3 , 4 -Benzo 
42 


4 
1 
3-(4-Me)Ar 
1 
3-(B-Ar)PhePr 


114 
18 


8 
3-Ar 


59 


3 
65 


3 
3-Bu 
7 
3-Me-5-H 


44             6              5het    3-Me-5-NoH 
52             5              5het    3-Pe 
111            2              5het    3-PheEt 
153            1              5het    3-PhePr 
(illegible)    6              5het    3-Pr 
223            1              5het    3-R13 
185            1              5het    (chrysenO) 
34             5              alkyl   Simple 
104            2              alkyl   (3) (Bl) (Bl) 


62 
alkyl 


(3 -Me) PhePr 
18 
alkyl 
alkyl 


47 
alkyl 


103 


2 


alkyl 


76 


2 


alkyl 


83 
alkyl 


215 


1 


alkyl 


43 


8 


alkyl 
15 


alkyl 


158 
alkyl 
alkyl 


166 


1 


alkyl 


53 
2 


alkenvT 




101 


2 


c vc 1 ohexvl 




149 


1 


c VC 1 ohexvl 


X ^ / tz \^XX^^ 


55 


3 


c VC 1 oh ^xvl 


'2 3 4 5-tRii 

^ f ^ f '•± f ^ X^iwL 


147 


1 


c VC 1_ oli^xvl 




209 


1 


cvcl oln^^ifvl 


^ \ J / *± xTiitr U ^ j lifer L. O i'lS 


208 


1 

J- 


^ Jf J- wxic-?vy X 




167 


1 


V-* X wxx^^^jr X 




165            1              cyclohexyl     2-iPr-3,5-Me 
150            1              cyclohexyl     3-sPe-6-Me 
161            1              cyclohexyl     4-Et-4-iBu 
219            1              cyclohexyl     (complex) 
175            1              cyclopentyl    2-Ar-4-spiro 
215            1              cyclopentyl    3-PhePr 



^To generate these names, all heteroatoms are first replaced by 
carbon (to produce the simplest common topology) and a particular 
structure is chosen from among these topologies as the "most typical" 
of that cluster, if possible one containing the largest substructure 
that distinguishes that cluster from all others. 

Within the name of a substitution, numbers indicate positions when 
substitution is on a ring, but chain length when substitution is on a 
chain (numbers separated by a colon indicate a range of chain 
lengths). Also, within a chain, letters indicate a position of 
substitution. (For example, (C2) describes a two atom branching from 
the third position of a chain, while 3-PhePr describes a phenyl 
propyl skeleton attached to the 3-position of a ring. ) 

A dot notation (.) separates the three possible substituents on an 
alkenyl root, the substituent order being same carbon as the -SH 
substituent, then the position trans to the -SH, and finally cis to -SH. 

The above notwithstanding, any name enclosed completely in 
parentheses takes its usual structural meaning. 




Here are structural descriptions for each name abbreviation in the 
above table, mostly in SLN (SYBYL Line Notation), listed 
alphabetically. (SLN extends SMILES with the following concepts, 
among others. Hydrogens are explicit. Ring openings and closures 
begin with a number enclosed by [] and end with the matching 
number preceded by @. Other SLN symbols used in these SLN 
definitions are: ~ = any bond; - = single bond (used here to provide a 
reference for [R]); : = aromatic bond; ! = the SLN following (here in 
parentheses) is not allowed; [F] = no additional atoms may be 
attached to the preceding atom; [!R] = preceding bond may not be in 
a ring; [R] = preceding bond must be in a ring.) 

5het = 5Het = C[1]:C:C:C:C:@1 
alkenyl = C=C 
alkyl = C~[!R]C 
aryl = Ar = Phe = Ph = C[1]:C:C:C:C:C@1 
benzyl = Bz = HSC-[!R]C~[R]C 
Bu = C-[!R]C-[!R]C-[!R]C-[!R]C 
cyclohexyl = Cyhx = C[1](-!=)C~C~C~C~C~@1 
cyclopentyl = C[1]~(-!=)C~C~C~C~@1 
Et = C-[!R]C 
inden = C[1]:C(~C--X-[2]):C(~@2):C:C@1 
iBu = C-[!R]C-[!R]C(-[!R]C)-[!R]C 
iPe = C-[!R]C-[!R]C-[!R]C(-[!R]C)-[!R]C 
Me = C 
naphth = C[1]:C(~C~X~[2]):C(~@2):C:C:C@1 
NoH = !(CH) 
O denotes ring fusion, e.g., benzo fuses a 6-membered aromatic ring. 
Pe = C-[!R]C-[!R]C-[!R]C-[!R]C-[!R]C 
Pr = C-[!R]C-[!R]C-[!R]C 
R# = alkyl chain of approximate length # 
Simple = !(C~[!R]C) 
sPe = C(-[!R]C)-[!R]C-[!R]C-[!R]C-[!R]C 
Stilbenyl = C=[!R]C-[!R]C[1]:C:C:C:C:C@1 
tBu = C(-[!R]C)(-[!R]C)-[!R]C 



